Using multiprocessing to crawl data and write it to a file: the program runs without errors, but the file is garbled (mojibake) when opened.
When the same code is rewritten with multithreading, the problem does not appear and everything works.
Here is the code that writes the data to the file:
def Get_urls(start_page, end_page):
    print 'run task {} ({})'.format(start_page, os.getpid())
    # every worker process opens its own handle to the same file
    url_text = codecs.open('url.txt', 'a', 'utf-8')
    for i in range(start_page, end_page + 1):
        pageurl = baseurl1 + str(i) + baseurl2 + searchword
        response = requests.get(pageurl, headers=header)
        soup = BeautifulSoup(response.content, 'html.parser')
        a_list = soup.find_all('a')
        for a in a_list:
            if a.text != '' and 'wssd_content.jsp?bookid' in a['href']:
                text = a.text.strip()
                url = baseurl + str(a['href'])
                url_text.write(text + '\t' + url + '\n')
    url_text.close()
The multiprocessing side uses a process pool:
def Multiple_processes_test():
    t1 = time.time()
    print 'parent process {} '.format(os.getpid())
    page_ranges_list = [(1, 3), (4, 6), (7, 9)]
    pool = multiprocessing.Pool(processes=3)
    for page_range in page_ranges_list:
        pool.apply_async(func=Get_urls, args=(page_range[0], page_range[1]))
    pool.close()
    pool.join()
    t2 = time.time()
    print 'time:', t2 - t1
As the screenshot already shows, the file was loaded with the wrong encoding: the bytes your processes wrote are not valid UTF-8.
Having several processes append to the same file is quite dangerous and very likely to go wrong: each worker holds its own buffered handle, and when their flushes interleave, a multibyte UTF-8 character can be split mid-sequence, which is exactly what mojibake looks like.
Multithreading most likely gets away with it because of the GIL (and because the threads share a single file object), while separate processes have no such lock, so corruption comes easily.
The line every worker executes is the culprit:
url_text = codecs.open('url.txt','a','utf-8')
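To see why interleaved appends break the encoding, here is a minimal sketch (my own illustration, not part of the original post): every process opens its own buffered handle to the same file and appends a 3-byte UTF-8 character in a tight loop. Whether a given run actually corrupts the file depends on the platform and flush timing; the filename demo.txt and the loop count are arbitrary.

import codecs
import multiprocessing

def worker(n):
    # each process gets its own handle and its own write buffer
    f = codecs.open('demo.txt', 'a', 'utf-8')
    for _ in range(100000):
        f.write(u'漢')  # 3 bytes in UTF-8; a flush can split it mid-character
    f.close()

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    pool.map(worker, range(4))
    pool.close()
    pool.join()
    raw = open('demo.txt', 'rb').read()
    try:
        raw.decode('utf-8')
        print('decoded cleanly this run')
    except UnicodeDecodeError as e:
        print('corrupted UTF-8: {}'.format(e))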
I suggest switching to a producer-consumer model, so that only one process ever writes the file!
For example:
# -*- coding: utf-8 -*-
from __future__ import print_function  # keep print() consistent on Python 2
import time
import os
import codecs
import multiprocessing
import requests
from bs4 import BeautifulSoup

baseurl = ''
baseurl1 = ''
baseurl2 = ''
pageurl = ''
searchword = ''
header = {}

# stub out requests.get so the example runs without hitting a real site
def fake(url, **kwargs):
    class Response(object):
        pass
    o = Response()
    o.content = '<a href="/{}/wssd_content.jsp?bookid">foo</a>'.format(url)
    return o

requests.get = fake
def Get_urls(start_page, end_page, queue):
    print('run task {} ({})'.format(start_page, os.getpid()))
    try:
        for i in range(start_page, end_page + 1):
            pageurl = baseurl1 + str(i) + baseurl2 + searchword
            response = requests.get(pageurl, headers=header)
            soup = BeautifulSoup(response.content, 'html.parser')
            a_list = soup.find_all('a')
            for a in a_list:
                if a.text != '' and 'wssd_content.jsp?bookid' in a['href']:
                    text = a.text.strip()
                    url = baseurl + str(a['href'])
                    # producers only put lines on the queue; they never touch the file
                    queue.put(text + '\t' + url + '\n')
    except Exception:
        import traceback
        traceback.print_exc()
def write_file(queue):
    print("start write file")
    # the consumer is the only process that ever writes url.txt
    url_text = codecs.open('url.txt', 'a', 'utf-8')
    while True:
        line = queue.get()
        if line is None:  # sentinel: all producers are done
            break
        print("write {}".format(line))
        url_text.write(line)
    url_text.close()
def Multiple_processes_test():
    t1 = time.time()
    manager = multiprocessing.Manager()
    queue = manager.Queue()  # a managed queue can be shared with pool workers
    print('parent process {} '.format(os.getpid()))
    page_ranges_list = [(1, 3), (4, 6), (7, 9)]
    # start the single consumer before any producers are launched
    consumer = multiprocessing.Process(target=write_file, args=(queue,))
    consumer.start()
    pool = multiprocessing.Pool(processes=3)
    results = []
    for page_range in page_ranges_list:
        result = pool.apply_async(func=Get_urls,
                                  args=(page_range[0],
                                        page_range[1],
                                        queue))
        results.append(result)
    pool.close()
    pool.join()
    queue.put(None)  # sentinel: tell the consumer to stop
    consumer.join()
    t2 = time.time()
    print('time:', t2 - t1)

if __name__ == '__main__':
    Multiple_processes_test()
The contents of url.txt after a run (the order of the 1-3 / 4-6 / 7-9 batches varies with scheduling, but every line is intact):
foo /4/wssd_content.jsp?bookid
foo /5/wssd_content.jsp?bookid
foo /6/wssd_content.jsp?bookid
foo /1/wssd_content.jsp?bookid
foo /2/wssd_content.jsp?bookid
foo /3/wssd_content.jsp?bookid
foo /7/wssd_content.jsp?bookid
foo /8/wssd_content.jsp?bookid
foo /9/wssd_content.jsp?bookid
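For completeness, an alternative that avoids the dedicated writer process (my own sketch, not part of the original answer) is to serialize the appends with a lock from multiprocessing.Manager, which, unlike a bare multiprocessing.Lock, can be passed to pool workers. Each task buffers its lines and appends them under the lock in one go; write_lines and the sample chunks below are hypothetical stand-ins for the real crawl results.

import codecs
import multiprocessing

def write_lines(lines, lock):
    # append a whole task's worth of lines while holding the shared lock,
    # so no two processes can interleave bytes inside url.txt
    with lock:
        f = codecs.open('url.txt', 'a', 'utf-8')
        f.writelines(lines)
        f.close()

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    lock = manager.Lock()  # a managed lock survives pickling; a bare Lock does not
    pool = multiprocessing.Pool(processes=3)
    for chunk in ([u'foo\t/1\n'], [u'foo\t/2\n'], [u'foo\t/3\n']):
        pool.apply_async(write_lines, args=(chunk, lock))
    pool.close()
    pool.join()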