first_value/last_value solution in MonetDB

Hi, Till I am trying to resolve my business problem I discovered the segfault bug which I have reported in MonetDB bug tracking system. I believe Monetdb is great product and team is moving in right direction utilizing latests advances in technology. Nevertheless my "business" problem is to build OHLC Time Series for longer time frame from shorter ohlc time series. I need partition by interval (5 minutes, 30 minutes, 1 hour, 1 day) and get first record of open, max of high, min of low and last of close from let's say 1 minute ohlc series. In postgresql there are 2 functions: first_value and last_value, which allows to do exactly this. Are there any function/approach/techniques which could be utilized to have the same results in MonetDB, I could do this aggregation on client side but this way I will not utilize DB potential and mainly the network traffic will be prohibitive for the solution. I am sure this question is already addressed somewhere, but I couldn't find, the solution should be trivial and pretty standard but I no luck on my side, if you push me in right directly I will really appreciate. Thank you in advance!John

Let me answer my own question. The preliminary solution would be: WITH open(sdate, open, rnk) as ( select sdate, open, row_number() over (partition by sdate order by qtime asc) as rnk from marketdata.quotes where sym='SPY' ), close as ( select sdate, close, row_number() over (partition by sdate order by qtime desc) as rnk from marketdata.quotes where sym='SPY' ), hilw(sdate,high,low) as ( select sdate, max(high),min(low) from marketdata.quotes where sym='SPY' group by sdate ) select open.sdate, open.open, hilw.high, hilw.low, close.close from open,close, hilw where open.rnk=1 and close.rnk=1 and open.sdate=close.sdate and open.sdate=hilw.sdate; But i suspect this SQL is not very optimal, since I query the same data partition many times. If there is any optimization possible, please let me know. Thanks,JohnFrom: caroptions@hotmail.com To: users-list@monetdb.org Subject: first_value/last_value solution in MonetDB Date: Thu, 9 Apr 2015 04:28:07 +0000 Hi, Till I am trying to resolve my business problem I discovered the segfault bug which I have reported in MonetDB bug tracking system. I believe Monetdb is great product and team is moving in right direction utilizing latests advances in technology. Nevertheless my "business" problem is to build OHLC Time Series for longer time frame from shorter ohlc time series. I need partition by interval (5 minutes, 30 minutes, 1 hour, 1 day) and get first record of open, max of high, min of low and last of close from let's say 1 minute ohlc series. In postgresql there are 2 functions: first_value and last_value, which allows to do exactly this. Are there any function/approach/techniques which could be utilized to have the same results in MonetDB, I could do this aggregation on client side but this way I will not utilize DB potential and mainly the network traffic will be prohibitive for the solution. I am sure this question is already addressed somewhere, but I couldn't find, the solution should be trivial and pretty standard but I no luck on my side, if you push me in right directly I will really appreciate. Thank you in advance!John

On 09 Apr 2015, at 18:45, Caroptions Caroptions <caroptions@hotmail.com> wrote:
Let me answer my own question.
The preliminary solution would be:
WITH open(sdate, open, rnk) as ( select sdate, open, row_number() over (partition by sdate order by qtime asc) as rnk from marketdata.quotes where sym='SPY' ), close as ( select sdate, close, row_number() over (partition by sdate order by qtime desc) as rnk from marketdata.quotes where sym='SPY' ), hilw(sdate,high,low) as ( select sdate, max(high),min(low) from marketdata.quotes where sym='SPY' group by sdate ) select open.sdate, open.open, hilw.high, hilw.low, close.close from open,close, hilw where open.rnk=1 and close.rnk=1 and open.sdate=close.sdate and open.sdate=hilw.sdate;
But i suspect this SQL is not very optimal, since I query the same data partition many times. If there is any optimization possible, please let me know.
Windowing functions are indeed what you can use for this case. You can just try how far you can go before the performance will become a problem. Would it be possible to assign each partition an ID so that you can group by that id and select min/max row_number as open and close? Regards, Jennie
Thanks, John
From: caroptions@hotmail.com To: users-list@monetdb.org Subject: first_value/last_value solution in MonetDB Date: Thu, 9 Apr 2015 04:28:07 +0000
Hi,
Till I am trying to resolve my business problem I discovered the segfault bug which I have reported in MonetDB bug tracking system.
I believe Monetdb is great product and team is moving in right direction utilizing latests advances in technology.
Nevertheless my "business" problem is to build OHLC Time Series for longer time frame from shorter ohlc time series.
I need partition by interval (5 minutes, 30 minutes, 1 hour, 1 day) and get first record of open, max of high, min of low and last of close from let's say 1 minute ohlc series. In postgresql there are 2 functions:
first_value and last_value, which allows to do exactly this.
Are there any function/approach/techniques which could be utilized to have the same results in MonetDB, I could do this aggregation on client side but this way I will not utilize DB potential and mainly the network traffic will be prohibitive for the solution.
I am sure this question is already addressed somewhere, but I couldn't find, the solution should be trivial and pretty standard but I no luck on my side, if you push me in right directly I will really appreciate.
Thank you in advance! John _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list

Hi Jennie, in my case "sdate" is such id and I partition by it or you've meant something else? Thank you for helping!JohnFrom: Y.Zhang@cwi.nl Subject: Re: first_value/last_value solution in MonetDB Date: Thu, 9 Apr 2015 23:34:55 +0200 To: users-list@monetdb.org On 09 Apr 2015, at 18:45, Caroptions Caroptions <caroptions@hotmail.com> wrote: Let me answer my own question. The preliminary solution would be: WITH open(sdate, open, rnk) as ( select sdate, open, row_number() over (partition by sdate order by qtime asc) as rnk from marketdata.quotes where sym='SPY' ), close as ( select sdate, close, row_number() over (partition by sdate order by qtime desc) as rnk from marketdata.quotes where sym='SPY' ), hilw(sdate,high,low) as ( select sdate, max(high),min(low) from marketdata.quotes where sym='SPY' group by sdate ) select open.sdate, open.open, hilw.high, hilw.low, close.close from open,close, hilw where open.rnk=1 and close.rnk=1 and open.sdate=close.sdate and open.sdate=hilw.sdate; But i suspect this SQL is not very optimal, since I query the same data partition many times. If there is any optimization possible, please let me know. Windowing functions are indeed what you can use for this case. You can just try how far you can go before the performance will become a problem. Would it be possible to assign each partition an ID so that you can group by that id and select min/max row_number as open and close? Regards,Jennie Thanks,John From: caroptions@hotmail.com To: users-list@monetdb.org Subject: first_value/last_value solution in MonetDB Date: Thu, 9 Apr 2015 04:28:07 +0000 Hi, Till I am trying to resolve my business problem I discovered the segfault bug which I have reported in MonetDB bug tracking system. I believe Monetdb is great product and team is moving in right direction utilizing latests advances in technology. Nevertheless my "business" problem is to build OHLC Time Series for longer time frame from shorter ohlc time series. I need partition by interval (5 minutes, 30 minutes, 1 hour, 1 day) and get first record of open, max of high, min of low and last of close from let's say 1 minute ohlc series. In postgresql there are 2 functions: first_value and last_value, which allows to do exactly this. Are there any function/approach/techniques which could be utilized to have the same results in MonetDB, I could do this aggregation on client side but this way I will not utilize DB potential and mainly the network traffic will be prohibitive for the solution. I am sure this question is already addressed somewhere, but I couldn't find, the solution should be trivial and pretty standard but I no luck on my side, if you push me in right directly I will really appreciate. Thank you in advance!John _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list

On 09 Apr 2015, at 23:43, Caroptions Caroptions <caroptions@hotmail.com> wrote:
Hi Jennie,
in my case "sdate" is such id and I partition by it or you've meant something else?
You can GROUP BY sdate, select min/max of qtime, then join with the base table to het corresponding open/close value. Not sure if this is faster, but it's probably not prettier than the row_number solution. WITH stmts in mdb are not computed before hand, instead, they are inlined and optimised. You can have a look at the PLAN and EXPLAIN output to see how much of the two WITHs are recomputed. Finally, just try the queries to see how they perfom. Regards, Jennie
Thank you for helping! John
From: Y.Zhang@cwi.nl Subject: Re: first_value/last_value solution in MonetDB Date: Thu, 9 Apr 2015 23:34:55 +0200 To: users-list@monetdb.org
On 09 Apr 2015, at 18:45, Caroptions Caroptions <caroptions@hotmail.com> wrote:
Let me answer my own question.
The preliminary solution would be:
WITH open(sdate, open, rnk) as ( select sdate, open, row_number() over (partition by sdate order by qtime asc) as rnk from marketdata.quotes where sym='SPY' ), close as ( select sdate, close, row_number() over (partition by sdate order by qtime desc) as rnk from marketdata.quotes where sym='SPY' ), hilw(sdate,high,low) as ( select sdate, max(high),min(low) from marketdata.quotes where sym='SPY' group by sdate ) select open.sdate, open.open, hilw.high, hilw.low, close.close from open,close, hilw where open.rnk=1 and close.rnk=1 and open.sdate=close.sdate and open.sdate=hilw.sdate;
But i suspect this SQL is not very optimal, since I query the same data partition many times. If there is any optimization possible, please let me know.
Windowing functions are indeed what you can use for this case. You can just try how far you can go before the performance will become a problem.
Would it be possible to assign each partition an ID so that you can group by that id and select min/max row_number as open and close?
Regards, Jennie
Thanks, John
From: caroptions@hotmail.com To: users-list@monetdb.org Subject: first_value/last_value solution in MonetDB Date: Thu, 9 Apr 2015 04:28:07 +0000
Hi,
Till I am trying to resolve my business problem I discovered the segfault bug which I have reported in MonetDB bug tracking system.
I believe Monetdb is great product and team is moving in right direction utilizing latests advances in technology.
Nevertheless my "business" problem is to build OHLC Time Series for longer time frame from shorter ohlc time series.
I need partition by interval (5 minutes, 30 minutes, 1 hour, 1 day) and get first record of open, max of high, min of low and last of close from let's say 1 minute ohlc series. In postgresql there are 2 functions:
first_value and last_value, which allows to do exactly this.
Are there any function/approach/techniques which could be utilized to have the same results in MonetDB, I could do this aggregation on client side but this way I will not utilize DB potential and mainly the network traffic will be prohibitive for the solution.
I am sure this question is already addressed somewhere, but I couldn't find, the solution should be trivial and pretty standard but I no luck on my side, if you push me in right directly I will really appreciate.
Thank you in advance! John _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list

Hi Jennie, Thank you for hint! One more question on this topic. In my understanding "WITH" is not supported as part of function, returning "TABLE" at the moment? I mean: function fun_name() returns table(columns ..)Begin return table (with ....);End Thanks,John Subject: Re: first_value/last_value solution in MonetDB From: Y.Zhang@cwi.nl Date: Fri, 10 Apr 2015 09:29:43 +0200 To: users-list@monetdb.org On 09 Apr 2015, at 23:43, Caroptions Caroptions <caroptions@hotmail.com> wrote: Hi Jennie, in my case "sdate" is such id and I partition by it or you've meant something else? You can GROUP BY sdate, select min/max of qtime, then join with the base table to het corresponding open/close value. Not sure if this is faster, but it's probably not prettier than the row_number solution. WITH stmts in mdb are not computed before hand, instead, they are inlined and optimised. You can have a look at the PLAN and EXPLAIN output to see how much of the two WITHs are recomputed. Finally, just try the queries to see how they perfom. Regards,Jennie Thank you for helping!John From: Y.Zhang@cwi.nl Subject: Re: first_value/last_value solution in MonetDB Date: Thu, 9 Apr 2015 23:34:55 +0200 To: users-list@monetdb.org On 09 Apr 2015, at 18:45, Caroptions Caroptions <caroptions@hotmail.com> wrote: Let me answer my own question. The preliminary solution would be: WITH open(sdate, open, rnk) as ( select sdate, open, row_number() over (partition by sdate order by qtime asc) as rnk from marketdata.quotes where sym='SPY' ), close as ( select sdate, close, row_number() over (partition by sdate order by qtime desc) as rnk from marketdata.quotes where sym='SPY' ), hilw(sdate,high,low) as ( select sdate, max(high),min(low) from marketdata.quotes where sym='SPY' group by sdate ) select open.sdate, open.open, hilw.high, hilw.low, close.close from open,close, hilw where open.rnk=1 and close.rnk=1 and open.sdate=close.sdate and open.sdate=hilw.sdate; But i suspect this SQL is not very optimal, since I query the same data partition many times. If there is any optimization possible, please let me know. Windowing functions are indeed what you can use for this case. You can just try how far you can go before the performance will become a problem. Would it be possible to assign each partition an ID so that you can group by that id and select min/max row_number as open and close? Regards,Jennie Thanks,John From: caroptions@hotmail.com To: users-list@monetdb.org Subject: first_value/last_value solution in MonetDB Date: Thu, 9 Apr 2015 04:28:07 +0000 Hi, Till I am trying to resolve my business problem I discovered the segfault bug which I have reported in MonetDB bug tracking system. I believe Monetdb is great product and team is moving in right direction utilizing latests advances in technology. Nevertheless my "business" problem is to build OHLC Time Series for longer time frame from shorter ohlc time series. I need partition by interval (5 minutes, 30 minutes, 1 hour, 1 day) and get first record of open, max of high, min of low and last of close from let's say 1 minute ohlc series. In postgresql there are 2 functions: first_value and last_value, which allows to do exactly this. Are there any function/approach/techniques which could be utilized to have the same results in MonetDB, I could do this aggregation on client side but this way I will not utilize DB potential and mainly the network traffic will be prohibitive for the solution. I am sure this question is already addressed somewhere, but I couldn't find, the solution should be trivial and pretty standard but I no luck on my side, if you push me in right directly I will really appreciate. Thank you in advance!John _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
participants (2)
-
Caroptions Caroptions
-
Ying Zhang