Coding Bash vs. Perl vs. Python (by Evan Liu)


Topics about threading

Threading: fetching a list of web pages with multiple threads

Bash does not support multithreading; however, running multiple processes in the background achieves similar functionality.
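To illustrate the multi-process approach in the same language as the Python examples, here is a minimal sketch that starts several child processes at once and then waits for all of them, much like `&` and `wait` in a shell. The function name `run_in_parallel` and the child command (a Python one-liner that only sleeps) are placeholders chosen for this sketch and are not part of the original scripts.

```python
import subprocess
import sys
import time


def run_in_parallel(delays):
    """Start one child process per delay, then wait for all of them."""
    start = time.time()
    # Popen returns immediately, so all children run at the same time.
    children = [
        subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(%s)" % d]
        )
        for d in delays
    ]
    # wait() blocks until the child exits and returns its exit code.
    exit_codes = [child.wait() for child in children]
    return exit_codes, time.time() - start


if __name__ == "__main__":
    codes, elapsed = run_in_parallel([0.2, 0.2, 0.2])
    print("exit codes: %s, elapsed: %.2f seconds" % (codes, elapsed))
```

Each child is a full process with its own memory, so results have to come back through exit codes, pipes, or files rather than shared variables.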
In Python, there are many ways to do multithreading. If you use a class:
1. create a class that inherits from threading.Thread
2. do the initialization in def __init__()
3. do your job in def run()
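The three steps above can be sketched as follows in Python 3. The class name `SleepThread` and its `delay` and `elapsed` attributes are made up for this illustration; the worker only sleeps, so the example runs without network access.

```python
import threading
import time


class SleepThread(threading.Thread):
    """Hypothetical worker: sleeps for `delay` seconds and records the time taken."""

    def __init__(self, delay):
        # Step 2: call the parent initializer before adding attributes.
        super().__init__()
        self.delay = delay
        self.elapsed = 0.0

    def run(self):
        # Step 3: run() executes in the new thread once start() is called.
        t1 = time.time()
        time.sleep(self.delay)
        self.elapsed = time.time() - t1


if __name__ == "__main__":
    workers = [SleepThread(0.1) for _ in range(3)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # wait for each thread to finish
    print([round(w.elapsed, 1) for w in workers])
```

Note that the caller invokes `start()`, not `run()`; calling `run()` directly would execute the work in the calling thread.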

Tasks (Bash, Perl, and Python versions, in that order):

Bash:

    site_str='bing.com|google.com|www.cnn.com|www.voa.com'
    #site_str='baidu.com|163.com|qq.com|fudan.com.cn|sohu.com|sina.com|weibo.com|taobao.com|jd.com'
    # Split on '|' into an array; the IFS assignment applies to this read only.
    IFS='|' read -r -a sites <<< "$site_str"
    start=$(date +%s)
    for (( i = 0; i < ${#sites[@]}; i++ )); do
        (
            # Each subshell is a separate process with its own copy of the
            # variables, so nothing assigned here is visible to the parent.
            echo "${sites[i]} is starting"
            t1=$(date +%s)
            text=$(wget "${sites[i]}" -O - 2>/dev/null)
            t2=$(date +%s)
            elapsed=$((t2 - t1))
            printf "%-15s --> %-15s seconds\n" "${sites[i]}" "$elapsed"
        ) &
    done
    wait    # block until every background subshell has finished
    end=$(date +%s)
    echo "actually time used: $((end - start))"
-------output----------

    www.voa.com is starting
    www.cnn.com is starting
    google.com is starting
    bing.com is starting
    www.cnn.com --> 9 seconds
    www.voa.com --> 12 seconds
    google.com --> 16 seconds
    bing.com --> 23 seconds
    actually time used: 23
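The Bash output above shows why running the downloads in parallel pays off: the wall-clock time matches the slowest single site rather than the sum of all sites. The short calculation below uses the per-site numbers copied from that output. Treating the sum as the sequential cost is only a rough estimate, since it assumes each site would take about as long when fetched on its own.

```python
# Per-site times in seconds, copied from the Bash output above.
site_seconds = {
    "www.cnn.com": 9,
    "www.voa.com": 12,
    "google.com": 16,
    "bing.com": 23,
}

# Rough cost of fetching one site after another.
sequential_estimate = sum(site_seconds.values())
# With all fetches running at once, the slowest one sets the pace.
parallel_estimate = max(site_seconds.values())

print("sum of per-site times:", sequential_estimate)  # 60
print("slowest single site:", parallel_estimate)      # 23, the reported actual time
```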
Perl:

    #todo

-------output----------

    #todo

Python (Python 2 syntax):

    import threading
    import time
    import urllib

    class WebpageThread(threading.Thread):
        def __init__(self, site):
            super(WebpageThread, self).__init__()
            self.site = site
            self.total = 0

        def run(self):
            # Fetch the page and record how long it took.
            t1 = time.time()
            #print "--> accessing to %s ..." % self.site
            u = urllib.urlopen("http://" + self.site)
            text = u.read()
            #print "--> done with %s " % self.site
            self.total = time.time() - t1

    sites = 'bing.com|google.com|yahoo.com|www.cnn.com|www.voa.com|www.youtube.com'.split('|')
    #sites = 'baidu.com|163.com|qq.com|fudan.com.cn|sohu.com|sina.com|weibo.com|taobao.com|jd.com'.split('|')
    t1 = time.time()
    threadlst = [WebpageThread(site) for site in sites]
    for t in threadlst:
        t.start()
    for t in threadlst:
        t.join()    # wait for every thread before reading its result
    total = 0
    for t in threadlst:
        print "%-15s --> %-15s seconds" % (t.site, t.total)
        total += t.total
    actual = time.time() - t1
    print "total time used: %d" % total
    print "actual time used: %d" % actual
-------output----------

    bing.com --> 17.1824159622 seconds
    google.com --> 14.7786750793 seconds
    yahoo.com --> 14.7702310085 seconds
    www.cnn.com --> 14.7730720043 seconds
    www.voa.com --> 14.7729201317 seconds
    www.youtube.com --> 17.1578299999 seconds
    total time used: 93
    actual time used: 17
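The Python listing above uses Python 2 syntax (print statements, urllib.urlopen). As one of the other ways to do multithreading, here is a hedged Python 3 sketch of the same measurement built on `concurrent.futures.ThreadPoolExecutor` instead of a Thread subclass. The helper names `fetch`, `timed`, and `time_sites` are invented for this sketch, as is the 30-second timeout. The demo at the bottom passes a stand-in action that only sleeps, so it runs without network access.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(site):
    """Download http://<site> and discard the body (needs network access)."""
    with urllib.request.urlopen("http://" + site, timeout=30) as response:
        response.read()


def timed(site, action):
    """Run action(site) and return (site, seconds taken)."""
    t1 = time.time()
    action(site)
    return site, time.time() - t1


def time_sites(sites, action=fetch):
    """Time action(site) for every site, using one worker thread per site."""
    start = time.time()
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        # map() keeps the results in the same order as the input list.
        results = list(pool.map(lambda s: timed(s, action), sites))
    return results, time.time() - start


if __name__ == "__main__":
    # Stand-in action so the demo does not touch the network.
    demo_sites = ["site-a", "site-b", "site-c"]
    results, actual = time_sites(demo_sites, action=lambda s: time.sleep(0.2))
    for site, seconds in results:
        print("%-15s --> %.2f seconds" % (site, seconds))
    print("total time used: %.2f" % sum(sec for _, sec in results))
    print("actual time used: %.2f" % actual)
```

Calling `time_sites(sites)` with the site list from the listing above, and no `action` argument, performs the real downloads. The pool handles starting and joining the threads, which replaces the three explicit loops in the class-based version.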