MonetDB vs Vertica CE

Luciano Sasso

16 Apr 2015

3:04 p.m.

Hi,

My team is asking me about using Vertica, but its Community Edition (CE) has limitations. Has anyone who has used Vertica found that it has advantages over MonetDB?

--
Luciano Sasso Vieira
Data Scientist & Solutions Architect
luciano@gsgroup.com.br | tel: 17 3353-0833 | cel: 17 99706-9335
www.gsgroup.com.br


shamsul hassan

16 Apr 2015

3:16 p.m.

One quick thing that comes to mind is that Vertica is free only up to 1 TB with the Community Edition; beyond that it has very high licensing fees. That is not the case with MonetDB: you can freely download it and use it with any amount of data.

Vertica is more enterprise-level and has very detailed documentation, whereas with MonetDB you have to make it work and figure things out yourself, mostly with help from the community.

The good thing is that MonetDB is the only open-source database that can challenge the paid columnar databases.

The community can add more points here.

Thanks

_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list

Ruben Silva

3:37 p.m.

Hello Luciano,

I don't have a full grasp of what has been introduced into MonetDB in the last year, but as far as I know there are still major differences between the two systems. For me the most important is scalability. Vertica has no single point of failure when working as a cluster. However, the CE version is limited to three nodes, so scalability is also limited in that case.

Vertica relies heavily on projections to improve performance. If your data model is relatively stable, they are fairly easy to use, but if your data model changes frequently you will have a considerable amount of additional work managing the projections. Also, if you use many projections they will hurt the performance of your DML statements.

MonetDB feeds on CPU and has no limit on its appetite, consuming as many resources as possible to answer a single statement. That can be very good or very bad, depending on your use case. Vertica has resource pools that regulate the resources (I/O, RAM, CPU) each query is allowed to use.

I don't know your use case (and that is crucial for good advice), but if you are considering a scenario with a single machine for the database system, then go with MonetDB without any doubt. It is just faster and simpler.

Regards,

shamsul hassan

3:47 p.m.

The only thing that has stopped me from using MonetDB in production is the lack of an UPSERT operation. In my case new data arrives every 30 minutes and I cannot truncate and reload the whole table each time. Ideally an UPSERT would help here, as it would touch only the records that are new or changed, without truncating the whole table.

I raised the question a few days ago on the community list but didn't get any useful answer.

If I get any workaround for these UPSERTs I would be as happy as a pig in dirty water :)

Thanks

José Raúl Pérez Rodríguez

4:34 p.m.

Hi Luciano,

I am involved in the same decision, Vertica or MonetDB.

I chose Vertica for only one reason: scalability.

Even though I will use only the Community Edition, which limits scaling to 3 nodes, the problem with MonetDB is that it does not support full functionality at the cluster level. For example, queries on a distributed table cannot be sorted. I asked about this on the developers' mailing list, and the answer was that it is not supported yet; they want to do it, but cannot say when.

I would like to work with MonetDB, but at production level this is a serious limitation.

Greetings

Miguel Ping

4:42 p.m.

Why don't you try Presto with MonetDB? It should be easy to write a connector.

https://prestodb.io

Ying Zhang

17 Apr 2015

11:11 a.m.

Hello Shamsul,

On Apr 16, 2015, at 15:47, shamsul hassan <shamsulbuddy@gmail.com> wrote:

> there is only one thing which has stopped me for using MonetDB in production is the lack of UPSERT operation. In my case new data comes every 30 mins

How much data (in number of records or size) do you receive?

> and I cannot truncate and load the whole table again.

What does "truncate and load the whole table" have to do with working around the lack of UPSERT?

> Ideally an UPSERT would have helped here as it will only touch those records which are new or changed without truncating the whole table.
>
> I raised the question few days before in Community list but didn't get any useful answer.

I probably overlooked it, but I don't recall any e-mail on the mailing list about UPSERT in the last few days. I do remember some truncate table discussions quite some time ago...

> If i get any workaround for these UPSERTS the i would be as happy as pig in dirty water :)

Please be aware that column stores such as MonetDB have some fundamental differences from the row stores you are probably very familiar with. MonetDB also has a rather different transaction scheme than many other DBMSs. Therefore, some advantages that UPSERT has in other DBMSs (such as the ones mentioned in https://wiki.postgresql.org/wiki/UPSERT#.22UPSERT.22_definition) may not apply in MonetDB.

With kind regards,

Jennie


Budulinek

11:34 a.m.

My 5 cents: I suspect that in this context UPSERT is misleading. What you need is rather partitioning functionality like what Oracle has. You need to break the data into multiple partitions (based on whatever criteria, for example date range, IP range, etc.). In Oracle the DML/query is transparent, so you do not care which partition the data physically goes to or resides in; the database automatically assigns the data to the correct partition.

Of course you have multiple options for how to treat a particular partition's data from a performance perspective. You may want to refresh, for example, only the last month of data. If you partition by month, you can truncate only one month of data and load it again, etc.

I really miss this in MonetDB.

Of course I may be wrong :)

milan

On Apr 17, 2015 11:11 AM, "Ying Zhang" <Y.Zhang@cwi.nl> wrote:
> > Ideally an UPSERT would have helped here as it will only touch those records which are new or changed without truncating the whole table.
>
> In the case of new records, UPSERT is a syntactic convenience for, e.g., loading the arrived records into a temp table, computing an EXCEPT against the base table, and inserting the results into the base table.
>
> In the case of changed records, you probably have some customised definition of "changed" that calls for some computation to identify and update them?
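The staging-table workaround described above can be sketched roughly as follows. This is only an illustration, not MonetDB-specific code: it uses Python's built-in sqlite3 module as a stand-in engine, the table and column names (base, staging, id, val) are hypothetical, and "changed" is assumed to mean "same key, different value". It also assumes each key appears at most once per incoming batch.

```python
import sqlite3

def upsert_via_staging(conn, rows):
    """Emulate UPSERT: load a batch into a temp table, update changed rows, insert new ones."""
    cur = conn.cursor()
    cur.execute("CREATE TEMP TABLE staging (id INTEGER, val TEXT)")
    cur.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    # Changed records: same key, different value (one possible definition of "changed").
    cur.execute(
        "UPDATE base SET val = (SELECT s.val FROM staging s WHERE s.id = base.id) "
        "WHERE EXISTS (SELECT 1 FROM staging s "
        "WHERE s.id = base.id AND s.val <> base.val)"
    )
    # New records: keys present in staging but not in base, found with EXCEPT.
    cur.execute(
        "INSERT INTO base (id, val) "
        "SELECT id, val FROM staging WHERE id IN "
        "(SELECT id FROM staging EXCEPT SELECT id FROM base)"
    )
    cur.execute("DROP TABLE staging")
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE base (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO base VALUES (?, ?)", [(1, "a"), (2, "b")])

# The batch changes key 2 and adds key 3; key 1 is never touched.
upsert_via_staging(conn, [(2, "B"), (3, "c")])
result = conn.execute("SELECT id, val FROM base ORDER BY id").fetchall()
print(result)  # [(1, 'a'), (2, 'B'), (3, 'c')]
```

Only the rows in the incoming batch are compared against the base table, so nothing has to be truncated and reloaded. Whether this performs well in a column store depends on batch size and on the engine's transaction scheme, which is the caveat raised in the reply.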


Ying Zhang

11:48 a.m.

On Apr 17, 2015, at 11:34, Budulinek <budulinku.dejmihrasku@gmail.com> wrote:

> My 5 cents: I suspect in the context the UPSERT is misleading. What you need is rather a partitioning functionality in terms of what Oracle has. You need to break the data in to multiple partitions (based on whatever criteria as for example date range, ip range, etc). The DML/query is (that is in Oracle) transparent so you do not care where (ie what partition) the data goes/is physically but the database automatically assigns the data into a correct partition.

This sounds very much like the new Merge Table feature in MonetDB. Selection queries can be made very transparent, but you need to specify explicitly into which (sub-)table the data should be inserted.
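To make the idea concrete, here is a small sketch of month-wise partitions with transparent reads and explicit writes. It does not use MonetDB's merge table syntax: it emulates the behaviour in Python's built-in sqlite3 module with one table per month and a UNION ALL view, and all table names (sales, sales_2015_03, sales_2015_04) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One physical table per month, playing the role of partitions.
for part in ("sales_2015_03", "sales_2015_04"):
    conn.execute(f"CREATE TABLE {part} (day TEXT, amount INTEGER)")

# A UNION ALL view gives queries a single, transparent logical table.
conn.execute(
    "CREATE VIEW sales AS "
    "SELECT * FROM sales_2015_03 UNION ALL SELECT * FROM sales_2015_04"
)

def reload_partition(conn, part, rows):
    """Refresh one month: empty that partition only, then load it again.

    `part` must be a trusted identifier; it is interpolated into the SQL text.
    """
    conn.execute(f"DELETE FROM {part}")  # SQLite has no TRUNCATE statement
    conn.executemany(f"INSERT INTO {part} VALUES (?, ?)", rows)
    conn.commit()

# Writes must name the target partition explicitly.
reload_partition(conn, "sales_2015_03", [("2015-03-31", 10)])
reload_partition(conn, "sales_2015_04", [("2015-04-01", 5)])

# Reloading April replaces only April's rows; March is left untouched.
reload_partition(conn, "sales_2015_04", [("2015-04-01", 7), ("2015-04-16", 3)])

total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)  # 20
```

The manual step is choosing the partition on insert; reads go through the single logical name, which matches the split between transparent selection and explicit insertion described above.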


Budulinek

12:01 p.m.

Very good. Almost bingo. I can live with some manual work during the data entry. Thanks for the advance notice! milan On Apr 17, 2015 11:48 AM, "Ying Zhang" <Y.Zhang@cwi.nl> wrote:

...

...
On Apr 17, 2015, at 11:34, Budulinek <budulinku.dejmihrasku@gmail.com> wrote:

My 5 cents: I suspect in the context the UPSERT is misleading. What you need is rather a partitioning functionality in terms of what Oracle has. You need to break the data in to multiple partitions (based on whatever criteria as for example date range, ip range, etc). The DML/query is (that is in Oracle) transparent so you do not care where (ie what partition) the data goes/is physically but the database automatically assigns the data into a correct partition.

This sounds very much like the new Merge Table future in MonetDB. Selection queries can be made very transparent. But you need to explicitly specify into which (sub-)table the data should be inserted.

...
Of course you have multiple options how treat the particular partition data from the performance perspective - you may want to refresh for example only last month of data. Now if you partition by month you can truncate only one month of data and load it again etc.

I really miss this in MonetDB.

Of course i may be wrong :)

milan

On Apr 17, 2015 11:11 AM, "Ying Zhang" <Y.Zhang@cwi.nl> wrote: Hello Shamsul,

...
On Apr 16, 2015, at 15:47, shamsul hassan <shamsulbuddy@gmail.com> wrote:

there is only one thing which has stopped me for using MonetDB in production is the lack of UPSERT operation. In my case new data comes every 30 mins

How many data (in #records or size) do you receive?

...
and I cannot truncate and load the whole table again.

What has "truncate and load the whole table" to do with workaround the lack of UPSERT?

...
Ideally an UPSERT would have helped here as it will only touch those records which are new or changed without truncating the whole table.

In case of new records, UPSERT is a syntactical convenience for, e.g., load the arrived records in a temp table, compute an except against the base table, and insert the results into base table.

In case of changed records, you probably have some customised definition of "changed" that calls for some computation to identify and update them?

...
I raised the question few days before in Community list but didn't get

any useful answer.

I probably overlooked it, but I don't recall any e-mail on the mailing list on the UPSERT topic in the last days. I do remember some truncate table discussions quite some time ago...

...
If i get any workaround for these UPSERTS the i would be as happy as

pig in dirty water :)

Please be aware that column-stores such as MonetDB has some fundamental differences than the row-stores you are probably very familiar with. And MonetDB has a rather different transaction scheme than many other DBMSs. Therefore, some advantages using UPSERT has in other DBMSs (such as the ones mentioned in https://wiki.postgresql.org/wiki/UPSERT#.22UPSERT.22_definition), may not apply in MonetDB.

With kind regards,

Jennie

> Thanks
>
> On Thu, Apr 16, 2015 at 2:37 PM, Ruben Silva <ruben.silva@cortex-intelligence.com> wrote:
>
>> Hello Luciano,
>>
>> I don't have a full grasp of the new features introduced into MonetDB in the last year, but as far as I know there are still major differences between the two systems. For me the most important is scalability. Vertica has no single point of failure when working as a cluster. However, the CE version is limited to three nodes, so scalability is also limited in that case.
>>
>> Vertica relies heavily on projections as a way to improve performance. If your data model is relatively stable, they are fairly easy to use, but if your data model changes at a high rate you will have a considerable amount of additional work managing the projections. Also, if you use many projections they will hurt the performance of your DML statements.
>>
>> MonetDB feeds on CPU and has no boundaries for its appetite, consuming as many resources as possible in order to respond to a single statement. That can be very good or very bad, depending on your use case. Vertica has resource pools that regulate the resources (IO, RAM, CPU) that each query is allowed to use.
>>
>> I don't know your use case (and that is crucial for good advice), but if you are considering a scenario where you have a single machine for the database system, then go with MonetDB without any doubt. It is just faster and simpler.
>>
>> Cumprimentos (Regards),
>>
>> 2015-04-16 14:04 GMT+01:00 Luciano Sasso <luciano@gsgroup.com.br>:
>>
>>> Hi,
>>>
>>> My team is asking me about using Vertica, but its CE version has limitations. Has anyone who has used Vertica found any advantage over MonetDB?
>>>
>>> --
>>> Luciano Sasso Vieira
>>> Data Scientist & Solutions Architect
>>> luciano@gsgroup.com.br | tel: 17 3353-0833 | cel: 17 99706-9335
>>> www.gsgroup.com.br


shamsul hassan

1:20 p.m.

On Fri, Apr 17, 2015 at 10:11 AM, Ying Zhang <Y.Zhang@cwi.nl> wrote:

> Hello Shamsul,
>
>> On Apr 16, 2015, at 15:47, shamsul hassan <shamsulbuddy@gmail.com> wrote:
>>
>> there is only one thing which has stopped me from using MonetDB in production: the lack of an UPSERT operation. In my case new data comes every 30 mins
>
> How much data (in #records or size) do you receive?

We receive around 500 K records per hour.

>> and I cannot truncate and load the whole table again.
>
> What does "truncate and load the whole table" have to do with working around the lack of UPSERT?

By truncate and load I mean this: if I try to insert the new data (which contains new records plus records that have changed in the last hour), it will throw an error because those records already exist.

>> Ideally an UPSERT would have helped here as it will only touch those records which are new or changed without truncating the whole table.
>
> In the case of new records, UPSERT is a syntactic convenience for, e.g., loading the arrived records into a temp table, computing an EXCEPT against the base table, and inserting the results into the base table.
>
> In the case of changed records, you probably have some customised definition of "changed" that calls for some computation to identify and update them?

I did follow this strategy: I load the new data into a TEMP table, then delete the matching records from the main table based on their unique keys, and then insert into the main table with a select * from the temp table. One issue is that the few records which have changed are temporarily unavailable in the main table until the insert from the temp table completes. Apart from that it's OK.

>> I raised the question a few days ago on the Community list but didn't get any useful answer.
>
> I probably overlooked it, but I don't recall any e-mail on the mailing list on the UPSERT topic in the last few days. I do remember some truncate table discussions quite some time ago...

Here is the mail: https://www.monetdb.org/pipermail/users-list/2015-January/007837.html

>> If I get any workaround for these UPSERTs then I would be as happy as a pig in dirty water :)
>
> Please be aware that column-stores such as MonetDB have some fundamental differences from the row-stores you are probably very familiar with. And MonetDB has a rather different transaction scheme than many other DBMSs. Therefore, some advantages that UPSERT has in other DBMSs (such as the ones mentioned in https://wiki.postgresql.org/wiki/UPSERT#.22UPSERT.22_definition) may not apply in MonetDB.

I do agree on this point, but I was just looking to solve this problem with the highest availability of data, something like MERGE in Teradata.

> With kind regards,
>
> Jennie

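The two workarounds described in this thread (an EXCEPT against the base table for new records, and delete-then-insert by key for changed records) can be sketched roughly as below. This is only an illustration, written in Python against SQLite so that it runs anywhere; it is not MonetDB code, and MonetDB's SQL dialect and client API may differ. The table names `base` and `staging`, the `(id, value)` schema, and all function names are hypothetical. Wrapping the DELETE and INSERT in one transaction is one possible way to avoid the window in which changed records are missing from the main table; whether MonetDB's transaction scheme gives concurrent readers the same all-or-nothing view is an assumption that would need to be verified.

```python
import sqlite3


def create_tables(conn):
    # Hypothetical schema: a keyed base table plus a staging table for each batch.
    conn.execute("CREATE TABLE base (id INTEGER PRIMARY KEY, value TEXT)")
    conn.execute("CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, value TEXT)")
    conn.commit()


def load_staging(conn, rows):
    # Replace the staging contents with the newly arrived batch.
    conn.execute("DELETE FROM staging")
    conn.executemany("INSERT INTO staging (id, value) VALUES (?, ?)", rows)


def insert_new_only(conn, rows):
    """Insert only rows whose key is not yet in base (EXCEPT on the key)."""
    with conn:  # commits on success, rolls back on error
        load_staging(conn, rows)
        conn.execute(
            "INSERT INTO base (id, value) "
            "SELECT id, value FROM staging WHERE id IN "
            "(SELECT id FROM staging EXCEPT SELECT id FROM base)"
        )


def upsert_from_staging(conn, rows):
    """Delete-then-insert by key, so new and changed rows are both applied."""
    with conn:  # DELETE and INSERT are committed together
        load_staging(conn, rows)
        conn.execute("DELETE FROM base WHERE id IN (SELECT id FROM staging)")
        conn.execute("INSERT INTO base (id, value) SELECT id, value FROM staging")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_tables(conn)
    conn.executemany(
        "INSERT INTO base (id, value) VALUES (?, ?)", [(1, "a"), (2, "b")]
    )
    conn.commit()
    # Row 2 has changed and row 3 is new.
    upsert_from_staging(conn, [(2, "b-changed"), (3, "c")])
    print(conn.execute("SELECT id, value FROM base ORDER BY id").fetchall())
    # prints [(1, 'a'), (2, 'b-changed'), (3, 'c')]
```

Note that `insert_new_only` compares keys only, so a changed record with an existing key is silently skipped; that matches the "new records" case, while `upsert_from_staging` covers the case where "changed" simply means "the same key arrived again".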

participants (7)

Budulinek
José Raúl Pérez Rodríguez
Luciano Sasso
Miguel Ping
Ruben Silva
shamsul hassan
Ying Zhang