Today I'm learning XPath. It is relatively simple (possibly because the task itself is simple), so let's get straight to it.

A brief introduction to XPath

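XPath is a language for finding information in an XML document. It can be used to navigate through the elements and attributes of an XML document. XPath is a major element of the W3C XSLT standard, and both XQuery and XPointer are built on XPath expressions. A tutorial is available at http://www.w3school.com.cn/xpath/index.asp (a third-party site, not the W3C specification itself). The points I consider most important are as follows: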
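
As a minimal sketch of the XPath syntax that the task below relies on, here are the four building blocks used most often with lxml: `//` to search anywhere, `[@attr="value"]` to filter by attribute, `/text()` to get text nodes, and `/@attr` to get attribute values. The sample HTML and the user names in it are made up for illustration and are not taken from the target page.

```python
from lxml import etree

# Hypothetical sample document, made up for illustration only
SAMPLE = """
<html><body>
  <div class="auth"><a href="/u/1">alice</a></div>
  <table><tr><td class="postbody">first post</td></tr></table>
  <div class="auth"><a href="/u/2">bob</a></div>
  <table><tr><td class="postbody">second post</td></tr></table>
</body></html>
"""

doc = etree.HTML(SAMPLE)

# // selects matching elements anywhere in the document
all_links = doc.xpath('//a')
# [@class="..."] is a predicate that keeps only elements with that attribute value
bodies = doc.xpath('//td[@class="postbody"]')
# /text() returns the text nodes that are direct children of the selected elements
names = doc.xpath('//div[@class="auth"]/a/text()')
# /@href returns the attribute values instead of the elements
hrefs = doc.xpath('//div[@class="auth"]/a/@href')

print(len(all_links), len(bodies))
print(names)
print(hrefs)
```

Note that `xpath()` always returns a list here, even when only one node matches.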

Implementing the task

# Basic steps first: fetch the page and parse it into an element tree
import requests
from lxml import etree

url = 'http://www.dxy.cn/bbs/thread/626626#626626'
headers = {
    'User-Agent':"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50"
}
res = requests.get(url, headers=headers)
res.encoding = res.apparent_encoding
html = etree.HTML(res.text)

comments = html.xpath('//td[@class="postbody"]/text()')
names = html.xpath('//div[@class="auth"]/a/text()')
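# Strip newlines, tabs, and spaces from every extracted text node
comments = [c.replace("\n", "").replace("\t", "").replace(" ", "") for c in comments]
# Pair each comment with an author name and print them
for con, name in zip(comments, names):
    print("Name: {}".format(name), " Comment: {}".format(con))
    print('\n')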

The final output looks like this:

That doesn't look quite right!!! I'm out of time for now, so I'll come back to it later.
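
One possible cause, not verified against the live page: `//td[@class="postbody"]/text()` returns one string per text node, so a post that contains line breaks or nested tags produces several entries, and `zip` then pairs names with the wrong fragments. The sketch below reproduces that effect on made-up markup (the HTML and user names are hypothetical, and the real page may be structured differently) and shows one way around it: select the `<td>` elements first, then take the string value of each with `string(.)`.

```python
from lxml import etree

# Hypothetical markup imitating a forum page; the real page may differ
SAMPLE = """
<html><body>
  <div class="auth"><a>alice</a></div>
  <table><tr><td class="postbody">line one<br/>line two</td></tr></table>
  <div class="auth"><a>bob</a></div>
  <table><tr><td class="postbody">only <b>one</b> line</td></tr></table>
</body></html>
"""

doc = etree.HTML(SAMPLE)
names = doc.xpath('//div[@class="auth"]/a/text()')

# text() yields one string per direct text node, so there are more
# fragments than posts and they no longer line up with the names
fragments = doc.xpath('//td[@class="postbody"]/text()')

# Select the <td> elements instead, then take the full string value of
# each one; string(.) concatenates all descendant text into one string
cells = doc.xpath('//td[@class="postbody"]')
comments = ["".join(cell.xpath('string(.)').split()) for cell in cells]

print(len(names), len(fragments), len(comments))
for name, comment in zip(names, comments):
    print("Name: {}  Comment: {}".format(name, comment))
```

With this approach there is exactly one comment per `<td>`, so the two lists stay aligned as long as every post has exactly one author link. The `"".join(... .split())` step removes all whitespace, matching the chained `replace` calls in the original script.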