Questions asked by Kamo Petrosyan

Kamo Petrosyan

Asked: 2020-10-26 20:59:08 +0000 UTC

pandas DataFrame aggregation: min, max, mean

2

I have a DataFrame with columns=['author_id', 'author_name', 'book_title', 'price']

I need to get a DataFrame with columns=['author_name', 'max_price', 'min_price']

Preferably via

groupby('author_name').agg({'price': 'min', 'price': 'max'})

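This is where the difficulty lies: both aggregations are on the same field, and I cannot set the names of the new columns right away. Correct me if I am wrong.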
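
A possible approach, sketched under the assumption that pandas 0.25 or newer is available: the dict form cannot work here because a Python dict literal with a repeated key keeps only the last value, so only 'max' would survive. Named aggregation passes each output column as a keyword with a (column, function) pair instead. The sample rows below are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "author_id": [1, 1, 2, 2, 2],
    "author_name": ["Tolstoy", "Tolstoy", "Chekhov", "Chekhov", "Chekhov"],
    "book_title": ["A", "B", "C", "D", "E"],
    "price": [300, 450, 120, 200, 90],
})

# Named aggregation: output_column=(input_column, aggregation_function).
# as_index=False keeps author_name as a regular column.
result = df.groupby("author_name", as_index=False).agg(
    max_price=("price", "max"),
    min_price=("price", "min"),
)
print(result)
```

The resulting columns are author_name, max_price, min_price, in that order; a mean could be added the same way with another keyword such as mean_price=("price", "mean").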

Kamo Petrosyan

Asked: 2020-03-17 15:14:46 +0000 UTC

UTF-8 in Python WSGI

0

The actual code:

def app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/html')])
    with codecs.open("template.html", 'r', 'utf8') as template_file:
        template_content = template_file.read()
    return template_content

and the server response is empty

GET / => generated 0 bytes in 1 msecs (HTTP/1.1 200) 1 headers in 44 bytes (1105 switches on core 0)

The document is empty. If template_content is replaced with u"Hi", the result is the same. A plain "Hi" is returned normally.

return str(template_content) raises an encoding error

UnicodeEncodeError: 'ascii' codec can't encode characters in position 82-86: ordinal not in range(128)

Checked with both uwsgi and wsgiref.simple_server
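
A likely cause, offered as a sketch rather than a confirmed diagnosis: WSGI (PEP 3333) expects the application to return an iterable of byte strings, while codecs.open(..., 'utf8') yields decoded text. Encoding the text back to UTF-8 and returning it inside a list is one way to satisfy that contract. The make_app factory below is a hypothetical helper introduced only so the template path is a parameter.

```python
import io


def make_app(template_path):
    """Build a WSGI app that serves one UTF-8 template file (hypothetical helper)."""

    def app(environ, start_response):
        # Read the template as text, decoding from UTF-8.
        with io.open(template_path, "r", encoding="utf-8") as template_file:
            template_content = template_file.read()

        # WSGI response bodies must be bytes, so encode explicitly.
        body = template_content.encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        # Return an iterable with a single chunk instead of a bare string.
        return [body]

    return app
```

With this, app = make_app("template.html") could be passed to uwsgi or wsgiref.simple_server in place of the original app.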

Kamo Petrosyan

Asked: 2020-03-06 19:05:40 +0000 UTC

SQL query optimization or denormalization?

0

The problem is that a database query takes an extremely long time (up to 400 ms)

    SELECT * FROM production_category AS cc
        INNER JOIN LATERAL (
            SELECT cc.id AS id, SUM(products) AS p_count FROM (
                SELECT cc.id AS parent_id, category_id, COUNT(id) AS products FROM production_product_categories WHERE category_id IN (
                    SELECT id FROM production_category WHERE lft <= cc.rght AND lft >= cc.lft AND tree_id = cc.tree_id)
                    AND product_id IN ( SELECT id FROM production_product WHERE manufacturer_id = 15 )
                    GROUP BY category_id
                ) AS sub_cc
            GROUP BY parent_id ) AS cp USING(id)
    WHERE cp.p_count > 0;

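The structure is as follows:

There is a manufacturer and its products (production_product), and any product can belong to one or more categories (production_category). There can also be categories that contain no products from this manufacturer.

The task is to select the categories that contain this manufacturer's products.

So the question arises: either I over-engineered the query, or the database should be denormalized, storing the manufacturer's category information in a separate table (a direct relation (manufacturer ID, category ID)).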

PostgreSQL (9.3.4)

     Nested Loop  (cost=207.57..129375.37 rows=42 width=223)
       ->  Seq Scan on production_category cc  (cost=0.00..25.20 rows=620 width=191)
       ->  Subquery Scan on cp  (cost=207.57..208.62 rows=1 width=36)
             Filter: (cc.id = cp.id)
             ->  HashAggregate  (cost=207.57..208.09 rows=42 width=12)
                   Filter: (sum((count(production_product_categories.id))) > 0::numeric)
                   ->  HashAggregate  (cost=206.41..206.83 rows=42 width=8)
                         ->  Nested Loop Semi Join  (cost=9.23..206.20 rows=42 width=8)
                               ->  Nested Loop  (cost=8.94..106.81 rows=45 width=12)
                                     ->  Bitmap Heap Scan on production_category  (cost=4.31..12.78 rows=1 width=4)
                                           Recheck Cond: ((lft <= cc.rght) AND (lft >= cc.lft))
                                           Filter: (tree_id = cc.tree_id)
                                           ->  Bitmap Index Scan on production_category_caf7cc51  (cost=0.00..4.31 rows=3 width=0)
                                                 Index Cond: ((lft <= cc.rght) AND (lft >= cc.lft))
                                     ->  Bitmap Heap Scan on production_product_categories  (cost=4.64..93.58 rows=45 width=12)
                                           Recheck Cond: (category_id = production_category.id)
                                           ->  Bitmap Index Scan on production_product_categories_b583a629  (cost=0.00..4.62 rows=45 width=0)
                                                 Index Cond: (category_id = production_category.id)
                               ->  Index Scan using production_product_pkey on production_product  (cost=0.29..2.20 rows=1 width=4)
                                     Index Cond: (id = production_product_categories.product_id)
                                     Filter: (manufacturer_id = 15)
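
One possible simplification, sketched here with Python's sqlite3 module so it can be run anywhere: since the task only asks which categories contain the manufacturer's products, an EXISTS subquery over the nested-set range can replace the two levels of aggregation. This is an assumption-laden illustration, not a measured fix: the sample rows are invented, the p_count column from the original query is dropped, and the PostgreSQL planner may treat the query differently from SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE production_category (
    id INTEGER PRIMARY KEY, tree_id INTEGER, lft INTEGER, rght INTEGER);
CREATE TABLE production_product (
    id INTEGER PRIMARY KEY, manufacturer_id INTEGER);
CREATE TABLE production_product_categories (
    id INTEGER PRIMARY KEY, product_id INTEGER, category_id INTEGER);

-- Invented data: tree 1 has root 1 with children 2 and 3; tree 2 has root 4.
INSERT INTO production_category VALUES
    (1, 1, 1, 6), (2, 1, 2, 3), (3, 1, 4, 5), (4, 2, 1, 2);
-- Product 10 belongs to manufacturer 15, product 11 to another one.
INSERT INTO production_product VALUES (10, 15), (11, 99);
INSERT INTO production_product_categories VALUES
    (1, 10, 2), (2, 11, 3), (3, 11, 4);
""")

# Keep a category if any category in its subtree holds a product
# of the given manufacturer.
QUERY = """
SELECT cc.id
FROM production_category AS cc
WHERE EXISTS (
    SELECT 1
    FROM production_category AS sub
    JOIN production_product_categories AS ppc ON ppc.category_id = sub.id
    JOIN production_product AS p ON p.id = ppc.product_id
    WHERE sub.tree_id = cc.tree_id
      AND sub.lft >= cc.lft
      AND sub.lft <= cc.rght
      AND p.manufacturer_id = ?
)
ORDER BY cc.id
"""

category_ids = [row[0] for row in conn.execute(QUERY, (15,))]
print(category_ids)
```

With the invented data this selects categories 1 and 2: category 2 holds the product directly and category 1 is its ancestor. Whether this beats the original on real data would have to be checked with EXPLAIN ANALYZE before deciding on denormalization.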

Kamo Petrosyan

Asked: 2020-12-10 13:31:44 +0000 UTC

Getting rid of a repeated INNER JOIN

0

PostgreSQL database

CREATE TABLE items (
    id SERIAL,
    user_id INTEGER,
    status SMALLINT
);

CREATE TABLE phones (
  phone VARCHAR(12),
  users INTEGER[]
);

where items.status is 3, 5, or 7

For a list of phones phone1, phone2, ... I need to select the following table

phone1; COUNT(items.id) where status=3; COUNT(items.id) where status=7
phone2; COUNT(items.id) where status=3; COUNT(items.id) where status=7
...

My version multiplies the rows through the repeated join and then counts them

SELECT phones.phone as phone, COUNT(it1.id) as saled_count, COUNT(it2.id) as not_saled_count
FROM phones
INNER JOIN items it1 ON phones.users @> ARRAY[it1.user_id]::int[]
INNER JOIN items it2 ON phones.users @> ARRAY[it2.user_id]::int[]
WHERE phones.phone IN ('phone1', 'phone2')
    AND it1.status = 7 AND it2.status = 3
GROUP BY phones.phone;

The result is

phone1    5    5
phone2    6    6
...

instead of

phone1    3    2
phone2    5    1
...

So, the question itself: how should the query be written?
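
A common way to avoid the multiplication is to join items once and count conditionally: joining the table twice pairs every status-7 row with every status-3 row for the same phone, so both counts become the product of the two. The sketch below uses Python's sqlite3 module so it runs without a server. Since SQLite has no array type, a hypothetical phone_users link table stands in for phones.users; in PostgreSQL the join condition could stay on the array, for example items.user_id = ANY(phones.users). The sample rows are invented, and the column order follows the query above (status 7 first, then status 3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, user_id INTEGER, status INTEGER);
-- Hypothetical link table replacing the phones.users INTEGER[] column.
CREATE TABLE phone_users (phone TEXT, user_id INTEGER);

INSERT INTO phone_users VALUES ('phone1', 1), ('phone1', 2), ('phone2', 3);
INSERT INTO items (user_id, status) VALUES
    (1, 7), (1, 7), (1, 3),
    (2, 7), (2, 3), (2, 5),
    (3, 7), (3, 7), (3, 7), (3, 7), (3, 7), (3, 3);
""")

# One join, two conditional counts: COUNT ignores the NULL that
# CASE yields when the status does not match.
rows = conn.execute("""
SELECT pu.phone,
       COUNT(CASE WHEN it.status = 7 THEN 1 END) AS saled_count,
       COUNT(CASE WHEN it.status = 3 THEN 1 END) AS not_saled_count
FROM phone_users AS pu
JOIN items AS it ON it.user_id = pu.user_id
WHERE pu.phone IN ('phone1', 'phone2')
  AND it.status IN (3, 7)
GROUP BY pu.phone
ORDER BY pu.phone
""").fetchall()
print(rows)
```

With the invented data this gives 3 and 2 for phone1 and 5 and 1 for phone2. On PostgreSQL 9.4 or newer, COUNT(*) FILTER (WHERE it.status = 7) expresses the same conditional count.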
