python - How can I get the content of a news article from its title?
Published: 2022-03-31 13:31:11
Related tags: # node.js
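I have a Listbox that shows news titles and times scraped from two links. They are printed in the Listbox after clicking the "View Titles" button, and this part works correctly.
Now I want to select a news title in the Listbox, click the "View Content" button, and see the content of that news item in a multiline text box below. Note that the title is itself the link to the news content. But I have a problem building the function for this: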
def content():
    if title.select:
        # click on title-link
        driver.find_element_by_tag_name("title").click()
        # Download Content to class for every title
        content_download = " ".join([span.text for span in div.select("text mbottom")])
        # Print Content in textbox
        textbox_download.insert(tk.END, content_download)
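So I imagine that, to get this, we have to simulate a click on the news title to open it (in the HTML it is title), then select the text of the content (in the HTML it is text mbottom), and then copy it into the text box of my file. Is that how it should be done? What do you think? Obviously my code is badly written and does not work. I am not very good at scraping. Can someone help me? Thank you very much.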
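As a minimal sketch of the idea described above, the content can probably be read without simulating a click: if each title's link (its href) is saved while scraping, the article page can be downloaded the same way as the list pages, and the body extracted by class. The sketch below shows only the extraction step. It assumes the article body sits in an element carrying both classes text and mbottom, as mentioned above; the sample HTML and the name extract_article_text are hypothetical and not taken from the real site.

```python
from bs4 import BeautifulSoup


def extract_article_text(html):
    """Return the text of every element that has both classes 'text' and 'mbottom'."""
    soup = BeautifulSoup(html, "html.parser")
    # ".text.mbottom" matches a single element that has both classes.
    blocks = soup.select(".text.mbottom")
    # Join the stripped text fragments of each block with single spaces.
    return "\n".join(block.get_text(" ", strip=True) for block in blocks)


# Hypothetical stand-in for an article page, only to show the selector at work.
sample_html = """
<html><body>
  <h1 class="title">Example headline</h1>
  <div class="text mbottom"><p>First paragraph.</p><p>Second paragraph.</p></div>
</body></html>
"""

article_text = extract_article_text(sample_html)
```

With a real page, the HTML would come from something like requests.get(url, headers=headers).text, where url is the href saved for the selected title. Note that select("text mbottom") without the dots is read as a tag selector (an mbottom tag nested inside a text tag), so it would not match an element by its classes.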
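The complete code is below (it runs correctly and scrapes the titles; I do not call the content function in the button). Apart from the function above, the code works well and fetches the titles and news times.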
from tkinter import *
from tkinter import ttk
import tkinter as tk
import sqlite3
import random
import tkinter.font as tkFont

window = Tk()
window.title("x")
window.geometry("800x800")

# Listbox for the scraped titles
# (previously self.tutti_pronostici, to show the calls from the other window)
textbox_title = tk.Listbox(window, width=80, height=16, font=('helvetic', 12), selectbackground="#960000", selectforeground="white", bg="white")
textbox_title.place(x=1, y=1)

# Listbox for the downloaded content
textbox_download = tk.Listbox(window, width=80, height=15, font=('helvetic', 12), selectbackground="#960000", selectforeground="white", bg="white")
textbox_download.place(x=1, y=340)
# Download All Titles and Time
def all_titles():
    allnews = []

    import requests
    from bs4 import BeautifulSoup

    # mock browser request
    headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'
    }

    # ATALANTA
    site_atalanta = requests.get('https://www.tuttomercatoweb.com/atalanta/', headers=headers)
    soup = BeautifulSoup(site_atalanta.content, 'html.parser')
    news = soup.find_all('div', attrs={"class": "tcc-list-news"})
    for each in news:
        for div in each.find_all("div"):
            time = div.find('span', attrs={'class': 'hh serif'}).text
            title = " ".join([span.text for span in div.select("a > span")])
            news = f" {time} {'ATALANTA'}, {title} (TMW)"
            allnews.append(news)

    # BOLOGNA
    site_bologna = requests.get('https://www.tuttomercatoweb.com/bologna/', headers=headers)
    soup = BeautifulSoup(site_bologna.content, 'html.parser')
    news = soup.find_all('div', attrs={"class": "tcc-list-news"})
    for each in news:
        for div in each.find_all("div"):
            time = div.find('span', attrs={'class': 'hh serif'}).text
            title = " ".join([span.text for span in div.select("a > span")])
            news = f" {time} {'BOLOGNA'}, {title} (TMW)"
            allnews.append(news)

    allnews.sort(reverse=True)
    for news in allnews:
        textbox_title.insert(tk.END, news)
# Download Content of News
def content():
    if titolo.select:
        # click on title-link
        driver.find_element_by_tag_name("title").click()
        # Download Content to class for every title
        content_download = " ".join([span.text for span in div.select("text mbottom")])
        # Print Content in textbox
        textbox_download.insert(tk.END, content_download)
button = tk.Button(window, text="View Titles", command=all_titles)
button.place(x=1, y=680)

button2 = tk.Button(window, text="View Content", command=content)
button2.place(x=150, y=680)

window.mainloop()
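Since a Listbox only stores strings, the selected row has to be mapped back to the article URL somehow. One possible approach, sketched below under assumptions, keeps each label together with its URL, sorts them as pairs so they stay aligned with the Listbox rows, and looks up the URL by the selected row index. The names entries, url_for_selection, show_selected_content and fetch_text are hypothetical, and the example.com URLs are placeholders rather than real article links.

```python
def url_for_selection(selection, urls):
    """Return the URL for the first selected row, or None if nothing is selected."""
    if not selection:
        return None
    return urls[selection[0]]


def show_selected_content(listbox, text_widget, urls, fetch_text):
    """Show the article behind the selected Listbox row in a Text widget.

    listbox needs curselection(); text_widget needs delete() and insert(),
    as tkinter's Listbox and Text provide. fetch_text is any callable that
    takes a URL and returns the article text.
    """
    url = url_for_selection(listbox.curselection(), urls)
    if url is None:
        return False
    text_widget.delete("1.0", "end")          # clear the previous article
    text_widget.insert("end", fetch_text(url))
    return True


# Placeholder data: each label is stored together with its (hypothetical) URL.
entries = [
    (" 09:15 BOLOGNA, Second headline (TMW)", "https://example.com/b"),
    (" 10:30 ATALANTA, First headline (TMW)", "https://example.com/a"),
]
entries.sort(reverse=True)                    # same ordering as the Listbox rows
labels = [label for label, _ in entries]      # what gets inserted into the Listbox
urls = [url for _, url in entries]            # urls[i] belongs to Listbox row i
```

In the program above this could be wired to the button as command=lambda: show_selected_content(textbox_title, textbox_download, urls, fetch_text), with textbox_download created as a tk.Text widget instead of a tk.Listbox, because delete("1.0", "end") uses Text indices.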