# Google POI Crawler

Code: `swire_shop_review.py`

Execute:

    python swire_shop_review.py [port] [proxyport]
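The argument handling lives in `swire_shop_review.py`; as a minimal sketch only, assuming `proxyport` points at a local HTTP proxy and `port` merely distinguishes worker instances (both are assumptions, not documented behaviour), the startup could look like this:

```python
import sys

from selenium import webdriver


def build_driver(port: int, proxy_port: int) -> webdriver.Chrome:
    """Create a Chrome driver that routes traffic through a local proxy.

    Hypothetical helper: the real script may wire these arguments differently.
    """
    options = webdriver.ChromeOptions()
    options.add_argument(f"--proxy-server=http://127.0.0.1:{proxy_port}")
    # `port` is assumed to identify this worker instance (cf. loop_5555.bat / loop_6666.bat).
    options.add_argument(f"--remote-debugging-port={port}")
    return webdriver.Chrome(options=options)


if __name__ == "__main__":
    port, proxy_port = int(sys.argv[1]), int(sys.argv[2])
    driver = build_driver(port, proxy_port)
```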
Database tables:

- `swire_store_list` (line 103): stores the shop list.
- `reviews_table` (line 232): stores the crawler results; the function `save_js_to_db` writes the parsed data into the database (a sketch of this insert follows the list).
  `db_columns = ['author_id', 'author_page', 'author_name', 'author_image', 'author_review_count', 'review_time', 'review_content', 'review_image', 'store_review_time', 'store_review']`
- `review_process` (line 271): stores the crawler status.
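As a rough sketch of what the `save_js_to_db` step amounts to, assuming a SQLite backend and the `db_columns` listed above (the real project may use a different database, table layout, and connection handling):

```python
import sqlite3

DB_COLUMNS = ['author_id', 'author_page', 'author_name', 'author_image',
              'author_review_count', 'review_time', 'review_content',
              'review_image', 'store_review_time', 'store_review']


def save_review_row(conn: sqlite3.Connection, fid: str, review: dict) -> None:
    """Insert one parsed review into reviews_table, keyed by the shop's fid.

    Hypothetical helper name; it only mirrors what save_js_to_db is described to do.
    """
    columns = ['fid'] + DB_COLUMNS
    placeholders = ', '.join('?' for _ in columns)
    sql = f"INSERT INTO reviews_table ({', '.join(columns)}) VALUES ({placeholders})"
    values = [fid] + [review.get(col) for col in DB_COLUMNS]
    conn.execute(sql, values)
    conn.commit()
```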
Function `get_next_job` (line 98): gets the shop list. Required data columns: ==shop url== (called `item_url` in the code), ==fid==, and ==shop rating counts== (called `user_ratings_total` in the code). ==fid== is used as the key across all database tables.
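A hedged sketch of the lookup `get_next_job` is described to perform, assuming the shop list lives in `swire_store_list`; the "not yet processed" filter via `review_process` is an assumption, only the three columns above are documented:

```python
import sqlite3
from typing import Optional


def get_next_shop(conn: sqlite3.Connection) -> Optional[dict]:
    """Fetch one shop that still needs its reviews crawled (hypothetical query)."""
    row = conn.execute(
        """
        SELECT s.item_url, s.fid, s.user_ratings_total
        FROM swire_store_list AS s
        LEFT JOIN review_process AS p ON p.fid = s.fid
        WHERE p.fid IS NULL
        LIMIT 1
        """
    ).fetchone()
    if row is None:
        return None
    return {"item_url": row[0], "fid": row[1], "user_ratings_total": row[2]}
```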
Function `get_reviews` (line 205): first checks whether the store has any reviews, then uses the shop rating count divided by three as the number of page-down scrolls.
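A minimal sketch of that scrolling logic with Selenium; the CSS selector for the reviews pane and the sleep interval are assumptions, only the `user_ratings_total // 3` page-down count comes from the description above:

```python
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys


def scroll_reviews(driver: webdriver.Chrome, user_ratings_total: int) -> None:
    """Page down through the reviews pane often enough to load all reviews."""
    if user_ratings_total == 0:
        return  # store has no reviews, nothing to scroll
    # Roughly three reviews load per page-down, hence the division by three.
    page_downs = max(1, user_ratings_total // 3)
    pane = driver.find_element(By.CSS_SELECTOR, "div[role='main']")  # assumed selector
    for _ in range(page_downs):
        pane.send_keys(Keys.PAGE_DOWN)
        time.sleep(0.3)  # give lazily loaded reviews time to appear
```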