Questions asked by vortexsf24

vortexsf24

Asked: 2024-08-14 02:49:39 +0000 UTC

How do I parse elements that are loaded with JS?

5

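I am trying to parse the first photo of every listing on this page: https://www.otodom.pl/pl/wyniki/wynajem/mieszkanie/cala-polska?ownerTypeSingleSelect=ALL&viewType=listing&limit=72

However, images are parsed for only a few of the listings; the rest are loaded with JS.

What can I do or try in this case? At the very least, tell me what could help me avoid using Selenium.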
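
One direction that sometimes avoids Selenium: pages built with Next.js usually ship their data as JSON inside a script tag with id "__NEXT_DATA__", and that JSON can be read from the plain HTML response. Whether this page does so, and where the photos sit inside the JSON, is an assumption to verify in the page source. The sample HTML and its "ads"/"images"/"url" layout below are hypothetical and only illustrate the technique.

```python
import json
import re

# Matches the JSON payload that Next.js pages usually embed in the HTML.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)


def extract_next_data(html):
    """Return the embedded JSON as Python objects, or None if it is absent."""
    match = NEXT_DATA_RE.search(html)
    return json.loads(match.group(1)) if match else None


def find_values(node, key):
    """Yield every value stored under `key` anywhere in nested JSON data."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


# Hypothetical sample; the real structure has to be inspected by hand.
sample_html = '''<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"ads": [{"images": [{"url": "https://example.com/1.jpg"}]},
                   {"images": [{"url": "https://example.com/2.jpg"}]}]}}
</script></body></html>'''

data = extract_next_data(sample_html)
first_photos = [imgs[0]["url"] for imgs in find_values(data, "images") if imgs]
print(first_photos)
```

If the page has no such tag, the browser's network tab may reveal a JSON endpoint that the page calls, which can then be requested directly.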

vortexsf24

Asked: 2024-08-06 22:13:46 +0000 UTC

HTTP 429 responses when using aiohttp

5

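I need to get responses from more than 600 pages, but most of them come back with error 429. I tried putting the requests that returned 429 back into the task list, but that did not help much either.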

import asyncio
from typing import Optional

import aiohttp
from bs4 import BeautifulSoup

# `headers` and `URL` are defined elsewhere in the module.
tasks = []
results_number = 0


async def fetch(session: aiohttp.ClientSession, url: str, params: dict) -> Optional[str]:
    async with session.get(url, headers=headers, params=params) as response:
        if response.status == 429:
            # Re-queue the request. Tasks appended after gather() has been
            # called are not part of that gather() call.
            tasks.append(asyncio.create_task(fetch(session, url, params)))
            return None

        return await response.text()


async def parse_otodom() -> None:
    global results_number

    async with aiohttp.ClientSession() as session:
        html = await fetch(session, URL.get('otodom'), {})
        soup = BeautifulSoup(html, 'lxml')

        # The last pagination item holds the total number of pages.
        page_quantity = int(
            soup.find('ul', class_='e1h66krm4 css-iiviho').find_all('li', class_='css-1lclt1h')[-1].text)

        for page_number in range(1, page_quantity + 1):
            tasks.append(asyncio.create_task(fetch(session, URL.get('otodom'), {'page': page_number})))

        # Count only the pages that returned a body.
        for result in await asyncio.gather(*tasks):
            if result is not None:
                results_number += 1


asyncio.run(parse_otodom())
print(results_number)

The code does not raise any exceptions. At least tell me which direction to think in and what I could use. Thanks.
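
A common direction for 429 responses is to limit how many requests are in flight and to retry with an increasing delay. The sketch below is a minimal illustration of that idea, not a tested fix for this site: `fetch_with_retry`, `request`, `retries` and `base_delay` are hypothetical names, and the request itself is faked so the example runs without a network.

```python
import asyncio


async def fetch_with_retry(request, semaphore, retries=5, base_delay=1.0):
    """Call `request()` until it stops returning 429 or retries run out.

    `request` is a coroutine function returning a (status, body) pair.
    """
    for attempt in range(retries):
        async with semaphore:  # cap the number of requests in flight
            status, body = await request()
        if status != 429:
            return body
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        await asyncio.sleep(base_delay * 2 ** attempt)
    return None  # every attempt was rate-limited


async def demo():
    calls = 0

    async def fake_request():
        # Stand-in for a real HTTP call: rate-limited twice, then succeeds.
        nonlocal calls
        calls += 1
        return (429, None) if calls < 3 else (200, "<html>ok</html>")

    semaphore = asyncio.Semaphore(5)
    body = await fetch_with_retry(fake_request, semaphore, base_delay=0.01)
    return body, calls


result = asyncio.run(demo())
print(result)
```

With aiohttp, `request` could be a small coroutine that opens `session.get(url, headers=headers, params=params)` and returns `response.status` together with `await response.text()`. If the server sends a Retry-After header with the 429 response, waiting for that long instead of a fixed backoff is another option.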

vortexsf24

Asked: 2023-06-12 23:41:05 +0000 UTC

How do I run a function from another module in a new process?

5

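I have a problem with asynchronous Python code. We have written a bot with the aiogram library. It needs to start a process that endlessly runs a function from another module. How can this be done?

Here is the code of the function from the other module (the parser) that needs to run in a new process: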

async def main():
    while True:
        tasks = [
            asyncio.create_task(parse_skysports()),
            asyncio.create_task(parse_guardiansport()),
            asyncio.create_task(parse_ign()),
            asyncio.create_task(parse_stopgame()),
            asyncio.create_task(parse_politico()),
            asyncio.create_task(parse_voanews())
        ]

        connection = db.DbConnection()

        for result in await asyncio.gather(*tasks):
            connection.update_data(paper_name=result[0], news=result[1])

        await asyncio.sleep(20)  # pause between rounds without blocking the event loop

This function has to be called through asyncio.run() in the main module, but in a way that lets the bot keep running.

This approach does not work:

import parser
import asyncio
from aiogram import Bot, Dispatcher, executor, types
import multiprocessing as mp
...

def parsing():
    asyncio.run(parser.main())
...

if __name__ == '__main__':
    executor.start_polling(dp, skip_updates=True)
    process = mp.Process(target=parsing)
    process.start()

In this case the bot works, but the function from the parser module does not run.

Thank you in advance for your help.
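
A likely cause is the order of the calls: in aiogram 2.x, executor.start_polling() blocks until the bot stops, so the lines that create and start the process are never reached while the bot is running. A minimal sketch of the reversed order follows. `parser_main` is a hypothetical stand-in for parser.main(), the `rounds` and `pause` parameters exist only so the sketch can terminate, and the aiogram call is left as a comment.

```python
import asyncio
import multiprocessing as mp


async def parser_main(rounds=None, pause=20.0):
    """Hypothetical stand-in for parser.main(); rounds=None means run forever."""
    done = 0
    while rounds is None or done < rounds:
        # ... gather the parse_*() tasks and update the database here ...
        done += 1
        await asyncio.sleep(pause)
    return done


def parsing(rounds=None, pause=20.0):
    # The child process gets its own event loop through asyncio.run().
    return asyncio.run(parser_main(rounds, pause))


def make_parser_process():
    # daemon=True so the child is terminated when the bot process exits.
    return mp.Process(target=parsing, daemon=True)


# In the bot's main module (not executed in this sketch):
#
# if __name__ == '__main__':
#     process = make_parser_process()
#     process.start()                                 # start the parser first
#     executor.start_polling(dp, skip_updates=True)   # blocks until the bot stops
```

Naming the module `parser` may also clash with the standard-library module of the same name on Python versions before 3.10, so a different file name is a safer choice.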

vortexsf24

Asked: 2023-05-06 17:43:39 +0000 UTC

How can I write this condition more compactly?

7

Here is my code:

def is_odd_heavy(arr):
    even_arr = [x for x in arr if x%2==0]
    odd_arr = [x for x in arr if x%2!=0]

    if len(odd_arr)==0:
        return False
    elif len(even_arr)==0:
        return True

    return min(odd_arr)>max(even_arr)

Question: can the cumbersome conditions in the middle be replaced with something more compact?
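
One possible compact form, as a sketch: the `default` argument of max() covers the empty even list, and a truthiness check covers the empty odd list. The variable names are shortened here for illustration.

```python
def is_odd_heavy(arr):
    odd = [x for x in arr if x % 2 != 0]
    even = [x for x in arr if x % 2 == 0]
    # No odd elements -> False; no even elements -> any odd minimum beats -inf.
    return bool(odd) and min(odd) > max(even, default=float('-inf'))


print(is_odd_heavy([11, 4, 9, 2, 8]))      # smallest odd 9 > largest even 8
print(is_odd_heavy([11, 4, 9, 2, 3, 10]))  # smallest odd 3 < largest even 10
```

The behaviour matches the original function for empty input, for lists with only odd values, and for lists with only even values.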

vortexsf24

Asked: 2022-05-18 02:33:12 +0000 UTC

AttributeError: 'NoneType' object has no attribute 'text' when parsing

0

I am trying to parse the BTC exchange rate from Binance, but for some reason I get an error. The tag does exist; you can check.

import requests
from bs4 import BeautifulSoup

HOST = "https://www.binance.com"
URL = "https://www.binance.com/ru/trade/BTC_USDT"    

r = requests.get(URL).text
soup = BeautifulSoup(r,"lxml")

course = soup.find("div", class_="showPrice").text
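
The error means soup.find() returned None. A probable reason is that the price element is rendered by JavaScript in the browser and is absent from the HTML that requests receives. A sketch of an alternative is to read the price from Binance's public REST API instead of the page. The endpoint below is taken from the public Spot API and should be checked against the current documentation; `parse_price` and `fetch_btc_price` are hypothetical helper names, and the sample payload only mimics the response shape.

```python
import json
from decimal import Decimal
from urllib.request import urlopen

# Assumed endpoint of the public Spot API (symbol price ticker).
TICKER_URL = "https://api.binance.com/api/v3/ticker/price?symbol=BTCUSDT"


def parse_price(payload: str) -> Decimal:
    """Extract the price from a JSON body like {"symbol": ..., "price": ...}."""
    return Decimal(json.loads(payload)["price"])


def fetch_btc_price() -> Decimal:
    """Request the current BTC/USDT price (needs network access)."""
    with urlopen(TICKER_URL, timeout=10) as response:
        return parse_price(response.read().decode("utf-8"))


# Offline demonstration with a sample payload of the same shape.
sample = '{"symbol": "BTCUSDT", "price": "29123.45000000"}'
price = parse_price(sample)
print(price)
```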

vortexsf24

Asked: 2022-04-23 00:03:51 +0000 UTC

AttributeError: 'NoneType' object has no attribute 'get'

0

I am trying to parse a link from an img tag, but I get an error:

"image":block.find("img", class_="load_image load_done").get("src") AttributeError: 'NoneType' object has no attribute 'get'

Please help.

for block in blocks:
    cards.append(
        {
            "title": block.find("a").text,
            "link": block.find("a").get("href"),
            "bank": block.find_all("span")[0].text,
            "pay_system": block.find_all("span")[1].text,
            "bet": block.find("div", class_="card-v2__text-accent").text,
            "image": block.find("img", class_="load_image load_done").get("src"),
        }
    )
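
The error appears when find() returns None for at least one block, for example because the "load_done" class is added by JavaScript after the image loads (an assumption, not verified for this site). A minimal sketch of a guard follows. `safe_attr` is a hypothetical helper, and a plain dict stands in for a BeautifulSoup tag because both expose .get().

```python
def safe_attr(tag, name, default=None):
    """Return tag.get(name), or `default` when the tag was not found."""
    return tag.get(name, default) if tag is not None else default


# Stand-ins for the results of block.find("img", ...): a found tag and a miss.
found = {"src": "https://example.com/card.png"}
missing = None

print(safe_attr(found, "src"))
print(safe_attr(missing, "src"))
```

In the loop above this would read safe_attr(block.find("img", class_="load_image"), "src"), where matching on "load_image" alone is a guess at a class that is present before any script runs.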
