最近 MySQL で SQL の相談を受けたときに「もしかしたら GROUP_CONCAT
が使えるかも?」というアドバイスをした.GROUP BY
だけではなく GROUP_CONCAT
も知っておくと便利なので,今回は GROUP_CONCAT
の紹介記事を書く.
GROUP_CONCAT
とは?
GROUP_CONCAT
を使うと GROUP BY
で集約をしたレコードのデータを「連結した文字列として」返すことができる.
検証用データセット
今回は MySQL 5.7 で,MySQL から公式に公開されているデータセット「world database」を使って検証をする.
既に「world database」を含めた Docker イメージを公開しているので,今回は kakakakakku/mysql-world-database:5.7
を使う.
$ docker pull kakakakakku/mysql-world-database:5.7 $ docker run -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d kakakakakku/mysql-world-database:5.7 $ docker exec -it $(docker container ls | grep 'kakakakakku/mysql-world-database' | awk '{print $1}') /bin/sh
詳しくは以下の記事にまとめてある.
GROUP_CONCAT
を試す
まず「日本の都市名」を都道府県ごとに集計し,連結した文字列として返す SQL を書く.
SELECT District, COUNT(Name), GROUP_CONCAT(Name) FROM city WHERE CountryCode = 'JPN' GROUP BY District ORDER BY COUNT(Name) DESC;
実行すると,以下のように「日本の都市名」をカンマ区切りで取得できる.データセットの問題で,予想以上に「日本の都市名」が少ないこともわかる.
mysql> SELECT District, COUNT(Name), GROUP_CONCAT(Name) -> FROM city -> WHERE CountryCode = 'JPN' -> GROUP BY District -> ORDER BY COUNT(Name) DESC; +-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | District | COUNT(Name) | GROUP_CONCAT(Name) | +-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Osaka | 22 | Osaka,Sakai,Higashiosaka,Hirakata,Toyonaka,Takatsuki,Suita,Yao,Ibaraki,Neyagawa,Kishiwada,Izumi,Moriguchi,Kadoma,Matsubara,Daito,Minoo,Tondabayashi,Kawachinagano,Habikino,Ikeda,Izumisano | | Saitama | 21 | Urawa,Kawaguchi,Omiya,Kawagoe,Tokorozawa,Koshigaya,Soka,Ageo,Kasukabe,Sayama,Kumagaya,Niiza,Iruma,Misato,Asaka,Iwatsuki,Toda,Fukaya,Sakado,Fujimi,Higashimatsuyama | | Chiba | 19 | Chiba,Funabashi,Matsudo,Ichikawa,Kashiwa,Ichihara,Sakura,Yachiyo,Narashino,Nagareyama,Urayasu,Abiko,Kisarazu,Noda,Kamagaya,Nishio,Kimitsu,Mobara,Narita | | Tokyo-to | 18 | Tokyo,Hachioji,Machida,Fuchu,Chofu,Kodaira,Mitaka,Hino,Tachikawa,Hitachinaka,Ome,Higashimurayama,Musashino,Higashikurume,Koganei,Kokubunji,Akishima,Hoya | | Aichi | 15 | Nagoya,Toyohashi,Toyota,Okazaki,Kasugai,Ichinomiya,Anjo,Komaki,Seto,Kariya,Toyokawa,Handa,Tokai,Inazawa,Konan | | Kanagawa | 15 | Jokohama [Yokohama],Kawasaki,Sagamihara,Yokosuka,Fujisawa,Hiratsuka,Chigasaki,Atsugi,Yamato,Odawara,Kamakura,Hadano,Zama,Ebina,Isehara | | Hyogo | 11 | Kobe,Amagasaki,Himeji,Nishinomiya,Akashi,Kakogawa,Takarazuka,Itami,Kawanishi,Sanda,Takasago | | Hokkaido | 10 | Sapporo,Asahikawa,Hakodate,Kushiro,Obihiro,Tomakomai,Otaru,Ebetsu,Kitami,Muroran | | Shizuoka | 9 | Hamamatsu,Shizuoka,Shimizu,Fuji,Numazu,Fujieda,Fujinomiya,Yaizu,Mishima | | Mie | 6 | Yokkaichi,Suzuka,Tsu,Matsusaka,Kuwana,Ise | | Yamaguchi | 6 | Shimonoseki,Ube,Yamaguchi,Hofu,Tokuyama,Iwakuni | | Fukuoka | 5 | Fukuoka,Kitakyushu,Kurume,Omuta,Kasuga | | Gumma | 5 | Maebashi,Takasaki,Ota,Isesaki,Kiryu | | Hiroshima | 5 | Hiroshima,Fukuyama,Kure,Higashihiroshima,Onomichi | | Ibaragi | 5 | Mito,Hitachi,Tsukuba,Tama,Tsuchiura | | Fukushima | 4 | Iwaki,Koriyama,Fukushima,Aizuwakamatsu | | Gifu | 4 | Gifu,Ogaki,Kakamigahara,Tajimi | | Kyoto | 4 | Kioto,Uji,Maizuru,Kameoka | | Nagano | 4 | Nagano,Matsumoto,Ueda,Iida | | Nara | 4 | Nara,Kashihara,Ikoma,Yamatokoriyama | | Niigata | 4 | Niigata,Nagaoka,Joetsu,Kashiwazaki | | Tochigi | 4 | Utsunomiya,Ashikaga,Oyama,Kanuma | | Yamagata | 4 | Yamagata,Sakata,Tsuruoka,Yonezawa | | Aomori | 3 | Aomori,Hachinohe,Hirosaki | | Ehime | 3 | Matsuyama,Niihama,Imabari | | Miyazaki | 3 | Miyazaki,Miyakonojo,Nobeoka | | Nagasaki | 3 | Nagasaki,Sasebo,Isahaya | | Okayama | 3 | Okayama,Kurashiki,Tsuyama | | Okinawa | 3 | Naha,Okinawa,Urasoe | | Shiga | 3 | Otsu,Kusatsu,Hikone | | Ishikawa | 2 | Kanazawa,Komatsu | | Kumamoto | 2 | Kumamoto,Yatsushiro | | Miyagi | 2 | Sendai,Ishinomaki | | Oita | 2 | Oita,Beppu | | Tottori | 2 | Tottori,Yonago | | Toyama | 2 | Toyama,Takaoka | | Akita | 1 | Akita | | Fukui | 1 | Fukui | | Iwate | 1 | Morioka | | Kagawa | 1 | Takamatsu | | Kagoshima | 1 | Kagoshima | | Kochi | 1 | Kochi | | Saga | 1 | Saga | | Shimane | 1 | Matsue | | Tokushima | 1 | Tokushima | | Wakayama | 1 | Wakayama | | Yamanashi | 1 | Kofu | +-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 47 rows in set (0.00 sec)
GROUP_CONCAT
+ ORDER BY
を試す
GROUP_CONCAT
は関数内部での ORDER BY
をサポートしているため,以下のように SQL を書くと連結した文字列を並び替えることができる.
SELECT District, COUNT(Name), GROUP_CONCAT(Name ORDER BY Name) FROM city WHERE CountryCode = 'JPN' GROUP BY District ORDER BY COUNT(Name) DESC;
今回は「日本の都市名」を昇順に並び替えている.
mysql> SELECT District, COUNT(Name), GROUP_CONCAT(Name ORDER BY Name) -> FROM city -> WHERE CountryCode = 'JPN' -> GROUP BY District -> ORDER BY COUNT(Name) DESC; +-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | District | COUNT(Name) | GROUP_CONCAT(Name ORDER BY Name) | +-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Osaka | 22 | Daito,Habikino,Higashiosaka,Hirakata,Ibaraki,Ikeda,Izumi,Izumisano,Kadoma,Kawachinagano,Kishiwada,Matsubara,Minoo,Moriguchi,Neyagawa,Osaka,Sakai,Suita,Takatsuki,Tondabayashi,Toyonaka,Yao | | Saitama | 21 | Ageo,Asaka,Fujimi,Fukaya,Higashimatsuyama,Iruma,Iwatsuki,Kasukabe,Kawagoe,Kawaguchi,Koshigaya,Kumagaya,Misato,Niiza,Omiya,Sakado,Sayama,Soka,Toda,Tokorozawa,Urawa | | Chiba | 19 | Abiko,Chiba,Funabashi,Ichihara,Ichikawa,Kamagaya,Kashiwa,Kimitsu,Kisarazu,Matsudo,Mobara,Nagareyama,Narashino,Narita,Nishio,Noda,Sakura,Urayasu,Yachiyo | | Tokyo-to | 18 | Akishima,Chofu,Fuchu,Hachioji,Higashikurume,Higashimurayama,Hino,Hitachinaka,Hoya,Kodaira,Koganei,Kokubunji,Machida,Mitaka,Musashino,Ome,Tachikawa,Tokyo | | Aichi | 15 | Anjo,Handa,Ichinomiya,Inazawa,Kariya,Kasugai,Komaki,Konan,Nagoya,Okazaki,Seto,Tokai,Toyohashi,Toyokawa,Toyota | | Kanagawa | 15 | Atsugi,Chigasaki,Ebina,Fujisawa,Hadano,Hiratsuka,Isehara,Jokohama [Yokohama],Kamakura,Kawasaki,Odawara,Sagamihara,Yamato,Yokosuka,Zama | | Hyogo | 11 | Akashi,Amagasaki,Himeji,Itami,Kakogawa,Kawanishi,Kobe,Nishinomiya,Sanda,Takarazuka,Takasago | | Hokkaido | 10 | Asahikawa,Ebetsu,Hakodate,Kitami,Kushiro,Muroran,Obihiro,Otaru,Sapporo,Tomakomai | | Shizuoka | 9 | Fuji,Fujieda,Fujinomiya,Hamamatsu,Mishima,Numazu,Shimizu,Shizuoka,Yaizu | | Mie | 6 | Ise,Kuwana,Matsusaka,Suzuka,Tsu,Yokkaichi | | Yamaguchi | 6 | Hofu,Iwakuni,Shimonoseki,Tokuyama,Ube,Yamaguchi | | Fukuoka | 5 | Fukuoka,Kasuga,Kitakyushu,Kurume,Omuta | | Gumma | 5 | Isesaki,Kiryu,Maebashi,Ota,Takasaki | | Hiroshima | 5 | Fukuyama,Higashihiroshima,Hiroshima,Kure,Onomichi | | Ibaragi | 5 | Hitachi,Mito,Tama,Tsuchiura,Tsukuba | | Fukushima | 4 | Aizuwakamatsu,Fukushima,Iwaki,Koriyama | | Gifu | 4 | Gifu,Kakamigahara,Ogaki,Tajimi | | Kyoto | 4 | Kameoka,Kioto,Maizuru,Uji | | Nagano | 4 | Iida,Matsumoto,Nagano,Ueda | | Nara | 4 | Ikoma,Kashihara,Nara,Yamatokoriyama | | Niigata | 4 | Joetsu,Kashiwazaki,Nagaoka,Niigata | | Tochigi | 4 | Ashikaga,Kanuma,Oyama,Utsunomiya | | Yamagata | 4 | Sakata,Tsuruoka,Yamagata,Yonezawa | | Aomori | 3 | Aomori,Hachinohe,Hirosaki | | Ehime | 3 | Imabari,Matsuyama,Niihama | | Miyazaki | 3 | Miyakonojo,Miyazaki,Nobeoka | | Nagasaki | 3 | Isahaya,Nagasaki,Sasebo | | Okayama | 3 | Kurashiki,Okayama,Tsuyama | | Okinawa | 3 | Naha,Okinawa,Urasoe | | Shiga | 3 | Hikone,Kusatsu,Otsu | | Ishikawa | 2 | Kanazawa,Komatsu | | Kumamoto | 2 | Kumamoto,Yatsushiro | | Miyagi | 2 | Ishinomaki,Sendai | | Oita | 2 | Beppu,Oita | | Tottori | 2 | Tottori,Yonago | | Toyama | 2 | Takaoka,Toyama | | Akita | 1 | Akita | | Fukui | 1 | Fukui | | Iwate | 1 | Morioka | | Kagawa | 1 | Takamatsu | | Kagoshima | 1 | Kagoshima | | Kochi | 1 | Kochi | | Saga | 1 | Saga | | Shimane | 1 | Matsue | | Tokushima | 1 | Tokushima | | Wakayama | 1 | Wakayama | | Yamanashi | 1 | Kofu | +-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 47 rows in set (0.00 sec)
GROUP_CONCAT
+ DISTINCT
を試す
次にデータの重複を考える.以下の SQL だと「地域名」が,例えば Caribbean,Caribbean,Caribbean
のように重複してしまう.
SELECT Continent, GROUP_CONCAT(Region) FROM country GROUP BY Continent;
GROUP_CONCAT
は関数内部での DISTINCT
をサポートしているため,以下のように SQL を書くと,データの重複を除外することができる.
SELECT Continent, GROUP_CONCAT(DISTINCT Region) FROM country GROUP BY Continent;
実際に「地域名」の重複を除外できている.
mysql> SELECT Continent, GROUP_CONCAT(DISTINCT Region) -> FROM country -> GROUP BY Continent; +---------------+-------------------------------------------------------------------------------------------------+ | Continent | GROUP_CONCAT(DISTINCT Region) | +---------------+-------------------------------------------------------------------------------------------------+ | Asia | Eastern Asia,Middle East,Southeast Asia,Southern and Central Asia | | Europe | Baltic Countries,British Islands,Eastern Europe,Nordic Countries,Southern Europe,Western Europe | | North America | Caribbean,Central America,North America | | Africa | Central Africa,Eastern Africa,Northern Africa,Southern Africa,Western Africa | | Oceania | Australia and New Zealand,Melanesia,Micronesia,Micronesia/Caribbean,Polynesia | | Antarctica | Antarctica | | South America | South America | +---------------+-------------------------------------------------------------------------------------------------+ 7 rows in set (0.00 sec)
GROUP_CONCAT
+ DISTINCT
+ SEPARATOR
を試す
GROUP_CONCAT
のデフォルト区切り文字は ,
になっている.データに ,
が入っている可能性がある場合など,意図的に区切り文字を変更する場合は SEPARATOR
を設定する.
SELECT Continent, GROUP_CONCAT(DISTINCT Region SEPARATOR ' ? ') FROM country GROUP BY Continent;
以下は区切り文字を ?
にして SQL を実行している.
mysql> SELECT Continent, GROUP_CONCAT(DISTINCT Region SEPARATOR ' ? ') -> FROM country -> GROUP BY Continent; +---------------+-----------------------------------------------------------------------------------------------------------+ | Continent | GROUP_CONCAT(DISTINCT Region SEPARATOR ' ? ') | +---------------+-----------------------------------------------------------------------------------------------------------+ | Asia | Eastern Asia ? Middle East ? Southeast Asia ? Southern and Central Asia | | Europe | Baltic Countries ? British Islands ? Eastern Europe ? Nordic Countries ? Southern Europe ? Western Europe | | North America | Caribbean ? Central America ? North America | | Africa | Central Africa ? Eastern Africa ? Northern Africa ? Southern Africa ? Western Africa | | Oceania | Australia and New Zealand ? Melanesia ? Micronesia ? Micronesia/Caribbean ? Polynesia | | Antarctica | Antarctica | | South America | South America | +---------------+-----------------------------------------------------------------------------------------------------------+ 7 rows in set (0.00 sec)
group_concat_max_len
システム変数
MySQL の group_concat_max_len
システム変数は GROUP_CONCAT
が返す連結した文字列の最大バイト数を意味している.デフォルトは「1024」で,最小値は「4」を設定することができる.
実際に SHOW VARIABLES
で確認すると「1024」になっていることを確認できる.
mysql> SHOW VARIABLES LIKE 'group_concat_max_len'; +----------------------+-------+ | Variable_name | Value | +----------------------+-------+ | group_concat_max_len | 1024 | +----------------------+-------+ 1 row in set (0.01 sec)
今回は意図的に最小値「4」に設定すると,GROUP_CONCAT
の連結した文字列を「4」バイトに制限できる.
mysql> SET SESSION group_concat_max_len = 4; Query OK, 0 rows affected (0.00 sec) mysql> SELECT Continent, GROUP_CONCAT(DISTINCT Region) -> FROM country -> GROUP BY Continent; +---------------+-------------------------------+ | Continent | GROUP_CONCAT(DISTINCT Region) | +---------------+-------------------------------+ | Asia | East | | Europe | Balt | | North America | Cari | | Africa | Cent | | Oceania | Aust | | Antarctica | Anta | | South America | Sout | +---------------+-------------------------------+ 7 rows in set, 7 warnings (0.00 sec)
まとめ
- MySQL を使う場合
GROUP BY
だけではなくGROUP_CONCAT
も知っておくと便利 GROUP_CONCAT
は関数内部でORDER BY
やDISTINCT
やSEPARATOR
をサポートしているGROUP_CONCAT
の結果列のバイト数を調整する場合には,MySQL のシステム変数group_concat_max_len
を設定する