2

Creating a date dimension table in PostgreSQL

 1 year ago
source link: https://gist.github.com/duffn/38449526e00abb47f4ec292f0491313d
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

FROM clause can be simplified to something like:
(SELECT dat::DATE as datum
FROM GENERATE_SERIES('2012-01-01'::DATE, '2029-12-31'::DATE, '1 day') dat) DQ

This actually gives wrong information for certain dates. 2017-01-01 is shown as having year_actual of 2016. Similarly, the first three days of 2016 are listed in 2015.

Do you have MySQL version of query to populate data in d_date table? (INSERT Query)

Szeliga commented Dec 27, 2018

edited

This actually gives wrong information for certain dates. 2017-01-01 is shown as having year_actual of 2016. Similarly, the first three days of 2016 are listed in 2015.

@tayloramurphy according to the documentation this is correct, because the script extracts isoyear:

The ISO 8601 week-numbering year that the date falls in (not applicable to intervals)
SELECT EXTRACT(ISOYEAR FROM DATE '2006-01-01');
Result: 2005
SELECT EXTRACT(ISOYEAR FROM DATE '2006-01-02');
Result: 2006
Each ISO 8601 week-numbering year begins with the Monday of the week containing the 4th of January, so in early January or late December the ISO year may be different from the Gregorian year. See the week field for more information.

Same thing goes for week_of_year column:

week
The number of the ISO 8601 week-numbering week of the year. By definition, ISO weeks start on Mondays and the first week of a year contains January 4 of that year. In other words, the first Thursday of a year is in week 1 of that year.
In the ISO week-numbering system, it is possible for early-January dates to be part of the 52nd or 53rd week of the previous year, and for late-December dates to be part of the first week of the next year. For example, 2005-01-01 is part of the 53rd week of year 2004, and 2006-01-01 is part of the 52nd week of year 2005, while 2012-12-31 is part of the first week of 2013. It's recommended to use the isoyear field together with week to get consistent results.
SELECT EXTRACT(WEEK FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 7

https://gist.github.com/Szeliga/0cd925bdbbe8e9a2c9c115ab0fcca1ca slightly modified version

will-e-yams commented Dec 26, 2019

edited

i'm not 100% on week of year ISO definition, but i can't imagine that dates at the beginning and end of the same year would have the same ISO week. consider the following:

image

Week 52, day 6 and 7 of year 2000 both Jan 1st/2nd and Dec 30/31?

changing the calculation to

, EXTRACT(isoyear from datum) || TO_CHAR(datum, '"-W"IW-') || EXTRACT(isodow from datum) as week_of_year_iso

yields what i think are the expected result

image

There's also an error in first/last day of year.

image

If the date 2000-01-02 is actually in year 1999, then the last day of the year can't possibly be 1999-12-31; it has to be 2000-01-02. Unless you intend that field to always/only be Jan 1/Dec 31 of the year on the timestamp, in which case I'd say that column has literally no value.

Author

duffn commented Feb 7, 2020

Thank you for the comments all. I've made updates that I believe corrects all the addressed, but don't hesitate to let me know if I've missed something.

Changed

EXTRACT(ISOYEAR FROM datum) || TO_CHAR(datum, '"-W"IW-') || EXTRACT(ISODOW FROM datum) AS week_of_year_iso,
TO_DATE(EXTRACT(YEAR FROM datum) || '-01-01', 'YYYY-MM-DD') AS first_day_of_year,
TO_DATE(EXTRACT(YEAR FROM datum) || '-12-31', 'YYYY-MM-DD') AS last_day_of_year,

Thank you for the script! I made few adjustments to use in Portuguese. Maybe you can improve your script with below changes:

  1. Before insert, add set lc_time to 'desired language'; ; eg: set lc_time to 'pt_BR.UTF-8';
  2. Change TO_CHAR(datum, 'Day') AS day_name to TO_CHAR(datum, 'TMDay') AS day_name,
  3. Change TO_CHAR(datum, 'Month') AS month_name, to TO_CHAR(datum, 'TMMonth') AS month_name,

I hope this can help the script to be improved.

Thank you!

GENERATE_SERIES stop value:

  1. 292190 - 2769-12-28
  2. 2147483 - 7849-08-11
  3. 21474836 - Fails: ERROR: value too long for type character(6)

I am trying to generate dates to 9999-12-31 .NET 5 DateTime.MaxValue Field

What is the maximum date that I can generate?

Ozencb commented Apr 21, 2021

edited

There seems to be a problem with the first or the last few days of any year having wrong year_actual. Is there a fix for this or am I doing something wrong?

SELECT year_actual, date_actual
FROM dim_date
WHERE year_actual <> EXTRACT(YEAR FROM date_actual);
year_actual date_actual
2020 2019-12-30
2020 2019-12-31
2020 2021-01-01
2020 2021-01-02
2020 2021-01-03

Edit:
This answer seems to be doing the job

Author

duffn commented Jul 25, 2021

There seems to be a problem with the first or the last few days of any year having wrong year_actual. Is there a fix for this or am I doing something wrong?

SELECT year_actual, date_actual
FROM dim_date
WHERE year_actual <> EXTRACT(YEAR FROM date_actual);

year_actual date_actual
2020 2019-12-30
2020 2019-12-31
2020 2021-01-01
2020 2021-01-02
2020 2021-01-03

Edit:
This answer seems to be doing the job

Thanks. I've updated the gist.

Author

duffn commented Jul 25, 2021

Thank you for the script! I made few adjustments to use in Portuguese. Maybe you can improve your script with below changes:

1. Before insert, add **set lc_time to 'desired language';** ; eg: set lc_time to 'pt_BR.UTF-8';

2. Change **TO_CHAR(datum, 'Day') AS day_name** to **TO_CHAR(datum, 'TMDay') AS day_name,**

3. Change **TO_CHAR(datum, 'Month') AS month_name,** to **TO_CHAR(datum, 'TMMonth') AS month_name,**

I hope this can help the script to be improved.

Thank you!

That's great, thank you. I've updated the gist.

Thank you for the script! I made few adjustments to use in Portuguese. Maybe you can improve your script with below changes:

  1. Before insert, add set lc_time to 'desired language'; ; eg: set lc_time to 'pt_BR.UTF-8';
  2. Change TO_CHAR(datum, 'Day') AS day_name to TO_CHAR(datum, 'TMDay') AS day_name,
  3. Change TO_CHAR(datum, 'Month') AS month_name, to TO_CHAR(datum, 'TMMonth') AS month_name,

I hope this can help the script to be improved.

Thank you!

Just missed one:
4. Change TO_CHAR(datum, 'Mon') AS month_name_abbreviated, to TO_CHAR(datum, 'TMMon') AS month_name_abbreviated,

And maybe a little "comment" to remind people to "manually translate" the quarter names?

Thanks for this anyways, great script!

miovee commented Dec 28, 2021

pls add Financial calendar for a part also. many country's fin date start from jul-1 as start of 1st quarter and next year jun-30 as end of last quarter. year metinon as 2020-21 , 2021-22 .

I love this code! One question, I am wondering if it is possible to extend the table to work from 1900 to say 2049. Is it as simple as setting the start date to 1900-01-01 and then extending the series? I feel like doing so may break some of the logic


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK