layer:11

an addiction to or an obsession with acquiring, manipulating, and sharing information

String Aggregation

Say you have the following data and you want to concatenate all queries for each year.

 1 CREATE TABLE years (year smallint);
 2 INSERT INTO years VALUES
 3   (2008),
 4   (2007),
 5   (2009);
 6 
 7 CREATE TABLE top_queries (year smallint, place smallint, query varchar(50));
 8 INSERT INTO top_queries VALUES
 9   (2009,  4, 'twitter'),
10   (2008,  2, 'beijing 2008'),
11   (2008,  9, 'euro 2008'),
12   (2007,  8, 'second life'),
13   (2008,  8, 'wer kennt wen'),
14   (2009,  6, 'new moon'),
15   (2007,  3, 'facebook'),
16   (2007,  4, 'dailymotion'),
17   (2009,  8, 'windows 7'),
18   (2007,  7, 'ebuddy'),
19   (2009,  9, 'dantri.com.vn'),
20   (2009, 10, 'torpedo gratis'),
21   (2009,  3, 'tuenti'),
22   (2008,  1, 'sarah palin'),
23   (2008,  6, 'obama'),
24   (2009,  1, 'michael jackson'),
25   (2007,  1, 'iphone'),
26   (2009,  5, 'sanalika'),
27   (2007, 10, 'club penguin'),
28   (2007,  9, 'hi5'),
29   (2009,  2, 'facebook'),
30   (2008,  4, 'tuenti'),
31   (2007,  6, 'youtube'),
32   (2008,  7, 'nasza klasa'),
33   (2008,  3, 'facebook login'),
34   (2009,  7, 'lady gaga'),
35   (2007,  5, 'webkinz'),
36   (2007,  2, 'badoo'),
37   (2008,  5, 'heath ledger'),
38   (2008, 10, 'jonas brothers');

For SQL Server, a recursive CTE is good option:

 1 WITH
 2   q AS (
 3     SELECT
 4       ROW_NUMBER() OVER(PARTITION BY year ORDER BY place) AS nr,
 5       year,
 6       query
 7     FROM top_queries
 8   ),
 9   t AS (
10     SELECT
11       year,
12       1 AS query_nr,
13       CONVERT(varchar(8000), '') AS queries
14     FROM years
15     UNION ALL
16     SELECT
17       t.year,
18       t.query_nr + 1,
19       CONVERT(varchar(8000),
20         t.queries +
21         CASE WHEN LEN(t.queries) > 0 THEN ', ' ELSE '' END +
22         q.query
23       )
24     FROM t
25     INNER JOIN q ON
26       q.year = t.year AND
27       q.nr = t.query_nr
28   )
29 SELECT
30   year,
31   queries
32 FROM t
33 WHERE
34   query_nr IN (
35     SELECT MAX(nr) + 1 FROM q GROUP BY year
36   )
37 ORDER BY year;

Or maybe, although I never understood how exactly this works, FOR XML clause:

 1 SELECT
 2   year,
 3   LEFT(queries, LEN(queries) - 2) AS queries
 4 FROM (
 5   SELECT
 6     year,
 7     (
 8       SELECT q.query + ', ' AS [text()]
 9       FROM top_queries AS q
10       WHERE q.year = y.year
11       ORDER BY q.place
12       FOR XML PATH('')
13     ) AS queries
14   FROM years AS y
15   GROUP BY year
16 ) AS t
17 ORDER BY year;

In PostgreSQL, pretty much the same CTE query would work, but you’re better off with array_agg function:

1 SELECT
2   y.year,
3   array_to_string(array_agg(query), ', ') AS queries
4 FROM (
5   SELECT * FROM top_queries ORDER BY place DESC
6 ) AS q
7 INNER JOIN years AS y ON y.year = q.year
8 GROUP BY y.year
9 ORDER BY y.year;

Generating a Range of Values

For example, lets generate a set of months, from 2010-01-01 to 2011-01-01.

In SQL Server:

1 WITH t AS (
2   SELECT CONVERT(date, '2010-01-01') AS d
3   UNION ALL
4   SELECT DATEADD(MONTH, 1, d) FROM t WHERE d < '2010-12-01'
5 )
6 SELECT * FROM t ORDER BY d;

Same in PostgreSQL:

1 WITH RECURSIVE t AS (
2   SELECT '2010-01-01'::timestamp AS d
3   UNION ALL
4   SELECT d + '1 months' FROM t WHERE d < '2010-12-01'
5 )
6 SELECT * FROM t ORDER BY d

In PostgreSQL, you can also use generate_series() function, but it’s useful only in these basic cases:

1 SELECT * FROM generate_series('2010-01-01'::timestamp, '2010-12-01', '1 months');
| Archive | RSS | E-mail | Twitter | Alvis Mikovs