有一个数据框columns=['author_id', 'author_name', 'book_title', 'price']
需要获取DataFramecolumns=['author_name', 'max_price', 'min_price']
最好通过
groupby('author_name').agg({'price': 'min', 'price': 'max'})
难点就在这里,因为聚合在同一个字段上,+我不能立即设置新字段的名称。如我错了请纠正我。
有一个数据框columns=['author_id', 'author_name', 'book_title', 'price']
需要获取DataFramecolumns=['author_name', 'max_price', 'min_price']
最好通过
groupby('author_name').agg({'price': 'min', 'price': 'max'})
难点就在这里,因为聚合在同一个字段上,+我不能立即设置新字段的名称。如我错了请纠正我。
实际上代码:
def app(environ, start_response):
start_response('200 OK', [('Content-type', 'text/html')])
with codecs.open("template.html", 'r', 'utf8') as template_file:
template_content = template_file.read()
return template_content
和一个空的服务器响应
GET / => generated 0 bytes in 1 msecs (HTTP/1.1 200) 1 headers in 44 bytes (1105 switches on core 0)
该文档为空。如果将 template_content 替换为 u"Hi" - 同样的事情。“你好”是正常的回答。
return str(template_content) 抛出编码错误
UnicodeEncodeError: 'ascii' codec can't encode characters in position 82-86: ordinal not in range(128)
检查 uwsgi 和 wsgiref.simple_server
问题在于对数据库的查询需要极长的时间(最多 400 ms)
SELECT * FROM production_category AS cc
INNER JOIN LATERAL (
SELECT cc.id AS id, SUM(products) AS p_count FROM (
SELECT cc.id AS parent_id, category_id, COUNT(id) AS products FROM production_product_categories WHERE category_id IN (
SELECT id FROM production_category WHERE lft <= cc.rght AND lft >= cc.lft AND tree_id = cc.tree_id)
AND product_id IN ( SELECT id FROM production_product WHERE manufacturer_id = 15 )
GROUP BY category_id
) AS sub_cc
GROUP BY parent_id ) AS cp USING(id)
WHERE cp.p_count > 0;
结构如下:
有一个制造商(manufacturer),他的产品(production_product)和任何产品都可以属于一个或多个类别(production_category)。也可能有类别中没有该制造商的产品。
任务是选择包含该制造商产品的类别。
所以问题出现了:要么我对查询太聪明了,要么应该对数据库进行非规范化并将制造商的类别信息存储在单独的表中(确切关系(制造商 ID,类别 ID))。
PostgreSQL (9.3.4)
Nested Loop (cost=207.57..129375.37 rows=42 width=223)
-> Seq Scan on production_category cc (cost=0.00..25.20 rows=620 width=191)
-> Subquery Scan on cp (cost=207.57..208.62 rows=1 width=36)
Filter: (cc.id = cp.id)
-> HashAggregate (cost=207.57..208.09 rows=42 width=12)
Filter: (sum((count(production_product_categories.id))) > 0::numeric)
-> HashAggregate (cost=206.41..206.83 rows=42 width=8)
-> Nested Loop Semi Join (cost=9.23..206.20 rows=42 width=8)
-> Nested Loop (cost=8.94..106.81 rows=45 width=12)
-> Bitmap Heap Scan on production_category (cost=4.31..12.78 rows=1 width=4)
Recheck Cond: ((lft <= cc.rght) AND (lft >= cc.lft))
Filter: (tree_id = cc.tree_id)
-> Bitmap Index Scan on production_category_caf7cc51 (cost=0.00..4.31 rows=3 width=0)
Index Cond: ((lft <= cc.rght) AND (lft >= cc.lft))
-> Bitmap Heap Scan on production_product_categories (cost=4.64..93.58 rows=45 width=12)
Recheck Cond: (category_id = production_category.id)
-> Bitmap Index Scan on production_product_categories_b583a629 (cost=0.00..4.62 rows=45 width=0)
Index Cond: (category_id = production_category.id)
-> Index Scan using production_product_pkey on production_product (cost=0.29..2.20 rows=1 width=4)
Index Cond: (id = production_product_categories.product_id)
Filter: (manufacturer_id = 15)
PostgreSQL 数据库
CREATE TABLE items (
id SERIAL,
user_id INTEGER,
status SMALLINT
);
CREATE TABLE phones (
phone VARCHAR(12),
users INTEGER[]
);
其中 items.status 为 3、5 或 7
phone1, phone2... 列表需要选择下面的表格
phone1;COUNT(items.id) where status=3 ;COUNT(items.id) where status=7 phone2;COUNT(items.id) where status=3 ;COUNT(items.id) where status=7 ...
我的版本乘以重复然后总结
SELECT phones.phone as phone, COUNT(it1.id) as saled_count, COUNT(it2.id) as not_saled_count
FROM phones
INNER JOIN items it1 ON phones.users @> ARRAY[it1.user_id]::int[]
INNER JOIN items it2 ON phones.users @> ARRAY[it2.user_id]::int[]
WHERE phones.phone IN ('phone1', 'phone2')
AND it1.status = 7 AND it2.status = 3
GROUP BY phones.phone;
结果分别
phone1 5 5
phone2 6 6
...
代替
phone1 3 2
phone2 5 1
...
那么,问题本身:如何提出请求?