
csv - How do I save scraped data in multiple columns with Python?
天蓬老師 2017-04-18 10:25:51

I want to save the data my crawler scrapes into separate columns of a TSV file, but no matter what I try I cannot get it saved as two columns.
The goal is to have the numbered app entries in the first column and their category in the second column.

from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
import csv

html = urlopen("http://www.app12345.com/?area=tw&store=Apple%20Store")
bs0bj = BeautifulSoup(html, "html.parser")  # specify a parser explicitly


def GPname():
    # Collect all app names, one per line
    GPnameList = bs0bj.find_all("dd", {"class": re.compile("ddappname")})
    result = ''  # avoid shadowing the built-in str
    for name in GPnameList:
        result += name.get_text() + '\n'
        print(name.get_text())

    return result


def GPcompany():
    # Collect all app categories, one per line
    GPcompanyname = bs0bj.find_all("dd", {"style": re.compile("color")})
    result = ''
    for cpa in GPcompanyname:
        result += cpa.get_text() + '\n'
        print(cpa.get_text())
    return result




# The with-block closes the file automatically; no explicit close() is needed
with open('0217.tsv', 'w', newline='', encoding='utf-8') as f:
    f.write(GPname())
    f.write(GPcompany())

I'm probably not familiar enough with zip; when I save the file, every character ends up in its own cell.
I also found this reference, but nothing I tried based on it worked:
https://segmentfault.com/q/10...
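The key building block here is zip, which pairs up two lists element by element into rows. A minimal standalone sketch (the list contents are made up for illustration, standing in for the scraped results):

```python
# Hypothetical data standing in for the scraped name and category lists
names = ["1. Dongsen News Cloud", "2. Dawn of World"]
types = ["News", "Game"]

# zip pairs items by position: one (name, type) tuple per row
rows = list(zip(names, types))
print(rows)  # [('1. Dongsen News Cloud', 'News'), ('2. Dawn of World', 'Game')]
```

Each tuple then becomes one row of the output file, which is what keeps the two kinds of data in separate columns instead of one character per cell.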


Replies (1)
劉奇

It’s easier to write CSV files if your data is structured row by row, like this: [["1. Dongsen News Cloud", "News"], ["2. Dawn of World (Dawn of world)", "Game"]]

from urllib.request import urlopen  # urlopen lives in urllib.request in Python 3
from bs4 import BeautifulSoup
import re

html = urlopen("http://www.app12345.com/?area=tw&store=Apple%20Store")
bs0bj = BeautifulSoup(html, "html.parser")
GPnameList = [name.get_text() for name in bs0bj.find_all("dd", {"class": re.compile("ddappname")})]
GPcompanyname = [cpa.get_text() for cpa in bs0bj.find_all("dd", {"style": re.compile("color")})]

# zip pairs each name with its category; commas separate columns, newlines separate rows
data = '\n'.join([','.join(d) for d in zip(GPnameList, GPcompanyname)])
with open('C:/Users/sa/Desktop/0217.csv', 'wb') as f:
    f.write(data.encode('utf-8'))
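Since the original question asked for a TSV, the standard csv module can also do the writing: csv.writer with a tab delimiter handles quoting and row separators for you. A minimal sketch, with hypothetical lists standing in for the scraped data:

```python
import csv

# Hypothetical scraped data (in the real script these come from BeautifulSoup)
names = ["1. Dongsen News Cloud", "2. Dawn of World"]
companies = ["News", "Game"]

# newline='' is required so csv.writer controls line endings itself
with open("0217.tsv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerows(zip(names, companies))  # one (name, category) pair per row
```

Using csv.writer instead of a manual join also keeps fields containing commas or tabs intact, because the module quotes them automatically.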