PostgreSQL的window函数整理

PG在8.4以后版本中添加了一些Window Function功能，下面简单介绍

A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row — the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result.

Window Functions in SQL is an OLAP functionality that provides ranking, cumulative computation, and partitioning aggregation. Many commercial RDMBS such like Oracle, MS SQL Server and DB2 have implemented part of this specification, while open source RDMBS including PostgreSQL, MySQL and Firebird doesn't yet. To implement this functionality on PostgreSQL not only helps many users move from those RDBMS to PostgreSQL but encourages OLAP applications such as BI (Business Inteligence) to analyze large data set. This specification is defined first in SQL:2003, and improved in SQL:2008 www.zzzyk.com

简言之，聚合函数返回的各个分组的结果，窗口函数则为每一行返回结果,示例如下：

1.建示例表,初始化数据

DROP TABLE IF EXISTS empsalary;

CREATE TABLE empsalary(

depname varchar,

empno bigint,

salary int,

enroll_date date

); www.zzzyk.com

INSERT INTO empsalary VALUES('develop',10, 5200, '2007/08/01');

INSERT INTO empsalary VALUES('sales', 1, 5000, '2006/10/01');

INSERT INTO empsalary VALUES('personnel', 5, 3500, '2007/12/10');

INSERT INTO empsalary VALUES('sales', 4, 4800, '2007/08/08');

INSERT INTO empsalary VALUES('sales', 6, 5500, '2007/01/02');

INSERT INTO empsalary VALUES('personnel', 2, 3900, '2006/12/23');

INSERT INTO empsalary VALUES('develop', 7, 4200, '2008/01/01');

INSERT INTO empsalary VALUES('develop', 9, 4500, '2008/01/01');

INSERT INTO empsalary VALUES('sales', 3, 4800, '2007/08/01');

INSERT INTO empsalary VALUES('develop', 8, 6000, '2006/10/01');

INSERT INTO empsalary VALUES('develop', 11, 5200, '2007/08/15');

postgres=# select * from empsalary ;

depname | empno | salary | enroll_date

-----------+-------+--------+-------------

develop | 10 | 5200 | 2007-08-01

sales | 1 | 5000 | 2006-10-01

personnel | 5 | 3500 | 2007-12-10

sales | 4 | 4800 | 2007-08-08

sales | 6 | 5500 | 2007-01-02

personnel | 2 | 3900 | 2006-12-23

develop | 7 | 4200 | 2008-01-01

develop | 9 | 4500 | 2008-01-01

sales | 3 | 4800 | 2007-08-01

develop | 8 | 6000 | 2006-10-01

develop | 11 | 5200 | 2007-08-15

(11 rows) www.zzzyk.com

2.统计示例

a.统计各部门的总薪水，平均薪水和部门的详细情况

postgres=# select sum(salary) OVER (PARTITION BY depname),avg(salary) OVER (PARTITION BY depname),* from empsalary;

-------+-----------------------+-----------+-------+--------+-------------

25100 | 5020.0000000000000000 | develop | 10 | 5200 | 2007-08-01

25100 | 5020.0000000000000000 | develop | 7 | 4200 | 2008-01-01

25100 | 5020.0000000000000000 | develop | 9 | 4500 | 2008-01-01

25100 | 5020.0000000000000000 | develop | 8 | 6000 | 2006-10-01

25100 | 5020.0000000000000000 | develop | 11 | 5200 | 2007-08-15

7400 | 3700.0000000000000000 | personnel | 2 | 3900 | 2006-12-23

7400 | 3700.0000000000000000 | personnel | 5 | 3500 | 2007-12-10

20100 | 5025.0000000000000000 | sales | 3 | 4800 | 2007-08-01

20100 | 5025.0000000000000000 | sales | 1 | 5000 | 2006-10-01

20100 | 5025.0000000000000000 | sales | 4 | 4800 | 2007-08-08

20100 | 5025.0000000000000000 | sales | 6 | 5500 | 2007-01-02

(11 rows) www.zzzyk.com

b.统计人员在所在部门的薪水排名情况

postgres=# select rank() OVER (PARTITION BY depname ORDER BY salary),* from empsalary;

rank | depname | empno | salary | enroll_date

------+-----------+-------+--------+-------------

1 | develop | 7 | 4200 | 2008-01-01

2 | develop | 9 | 4500 | 2008-01-01

3 | develop | 10 | 5200 | 2007-08-01

3 | develop | 11 | 5200 | 2007-08-15

5 | develop | 8 | 6000 | 2006-10-01

1 | personnel | 5 | 3500 | 2007-12-10

2 | personnel | 2 | 3900 | 2006-12-23

1 | sales | 4 | 4800 | 2007-08-08

1 | sales | 3 | 4800 | 2007-08-01

3 | sales | 1 | 5000 | 2006-10-01

4 | sales | 6 | 5500 | 2007-01-02

(11 rows) www.zzzyk.com

3.一个有趣的例子注意使用order by,结果会两样

create table foo(a int,b int) ;

insert into foo values (1,1);

insert into foo values (2,1);

insert into foo values (4,1);

insert into foo values (2,1);

insert into foo values (4,1);

insert into foo values (5,1);

insert into foo values (11,3);

insert into foo values (12,3);

insert into foo values (22,3);

insert into foo values (16,3);

postgres=# select sum(a) over (partition by b), a, b from foo;

sum | a | b

-----+----+---

19 | 1 | 1

19 | 2 | 1

19 | 4 | 1

19 | 2 | 1

19 | 4 | 1

19 | 5 | 1

93 | 11 | 3

上一个：ROW_NUMBER()添加索引实现真分页
下一个：MongoDB学习笔记(聚合)

更多图片编程知识:

更多SQLServer疑问解答：: 配置MSSQL复制指定快照文件夹提示：不是有效的路径或文件名; 详细解读varchar和Nvarchar区别; SQL SERVER 2005 同步复制技术; 进程未能大容量复制到表解决方法; MSSql实例教程:MSSql数据库同步; SQLServer2000同步复制技术实现（分发和订阅）; sqlserver; SQLSERVER; 测试 sqlserver 最大用户数连接; 写出一条Sql语句：取出表A中第31到第40记录（SQLServer,以自动增长的ID作为主键,注意：ID可能不是连续的; sqlserver2008的安装问题。; 为什么安装的SQL server 2008我的没有MSSQLSERVER协议啊？; 怎样把exel表里的数据复制到sqlserver表里; sqlserver数据库主键和外键问题; sqlserver 中的左表连接查询和右表连接查询有啥不同？有什么用？