Groonga 4.0.1 has been released

How to install: Install

This release covers the following topics:


  Resolved the database size growth issue
  Supported weight vector column
  Supported adjuster option in select command


Resolved the database size growth issue

This release resolves an issue where the size of the database kept
increasing as records were updated.

Here is the history of the work on suppressing the size of the database:


  3.1.0 - Added GRN_JA_SKIP_SAME_VALUE_PUT environment variable


It skips updating the database if the value is the same. This feature was
marked as experimental.


  3.1.2 - Enabled GRN_JA_SKIP_SAME_VALUE_PUT=yes by default


This flag was enabled by default because the feature is reasonably
effective at suppressing the size of the database.

In contrast to the previous approach, Groonga 4.0.1 manages
variable-length data more effectively, which fixes the issue of the
database size increasing.

Note that you need to recreate the database to benefit from this fix.

Here is the summary:


  You can open a database created by a previous version with Groonga
4.0.1 or later.
  However, you can't open a database created by Groonga 4.0.1 with
previous versions.
  You need to recreate the database to use this feature.


In fact, ongaeshi, the developer of Milkode, tested the impact of this
feature. Here is the resulting graph of database size growth.
(Thanks to ongaeshi!!!)

[](http://d3j5vwomefv46c.cloudfront.net/photos/large/843929509.png?1394947803)

It shows that if you keep using an existing database as is
(4.0.0-72-continue), the size of the database just keeps increasing, but
if you recreate the database (4.0.0-72-new), the growth is suppressed.

Supported weight vector column

In Groonga 4.0.1, a vector column can store multiple pairs of key and
value, where the value is the weight of the key. This is called a weight
vector column.

For example, if you want to store a user's attributes as tags, you use
COLUMN_VECTOR as follows:

column_create Users tags COLUMN_VECTOR ShortText



However, this is not enough if the attributes differ in importance. As a
workaround, you needed additional columns to store the weight of each
attribute. Here is an example schema definition for that workaround:

column_create Users tags COLUMN_VECTOR ShortText
column_create Users tags_A COLUMN_SCALAR Int32
column_create Users tags_B COLUMN_SCALAR Int32
column_create Users tags_C COLUMN_SCALAR Int32...



With weight vector column support, Groonga can unify such columns into
one column. Use the WITH_WEIGHT flag in the column definition:

column_create Users tags COLUMN_VECTOR|WITH_WEIGHT ShortText



You can store pairs of key and weight in the vector column:

{"Tag A":weight1, "Tag B":weight2, "Tag C":weight3, ...}
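
As a rough illustration of this unification, the following Python sketch puts per-tag weights into a single key-to-weight object of the shape shown above and wraps it in a load command. The helper name build_load_command, the sample weights, and the assumption that the Users table has a _key are all hypothetical; only the {"key": weight} shape and the load --table syntax come from Groonga itself.

```python
import json


def build_load_command(table, records):
    """Return a Groonga load command (as text) for the given records."""
    return "load --table {}\n{}".format(table, json.dumps(records, indent=2))


# Hypothetical sample: per-tag weights that used to need one column per tag.
per_tag_weights = {"Tag A": 3, "Tag B": 1, "Tag C": 5}

# With COLUMN_VECTOR|WITH_WEIGHT, the pairs go into the single "tags" column.
# Assumption: the Users table has a key, so each record carries "_key".
records = [{"_key": "alice", "tags": per_tag_weights}]

command = build_load_command("Users", records)
print(command)
```

The printed text can be sent to Groonga as is, so the weights travel with the tags instead of living in separate scalar columns.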



Supported adjuster option in select command

In this release, Groonga supports the adjuster option in the select command.

In previous versions, you could assign a weight to each column by using
match_columns.

Here is the difference between match_columns and the adjuster option:


  match_columns - assigns a weight to each matched column
  adjuster - assigns a weight to a specific key in a column


Combined with weight vector column support, the adjuster option lets you
customize the ranking of search results.

For example, consider listing the people who use Groonga well. Assume
that each person's rating is stored in a weight vector column.

Here is the sample schema definition:

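# Users with a weight vector column (rating per key) and a plain tag vector.
table_create User TABLE_HASH_KEY ShortText
column_create User weight COLUMN_VECTOR|WITH_WEIGHT ShortText
column_create User tags COLUMN_VECTOR ShortText

# Index on User.weight that also stores the weights, used by adjuster.
table_create Weight TABLE_HASH_KEY ShortText
column_create Weight weight_index COLUMN_INDEX|WITH_WEIGHT User weight

# Index on User.tags, used by the filter.
table_create Tag TABLE_PAT_KEY ShortText
column_create Tag tags_index COLUMN_INDEX User tags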



Here is how to load the sample data:

load --table User
[
  {
    "_key":"alice",
    "weight":{"Groonga":30, "Mroonga":20},
    "tags": ["Groonga", "Mroonga"]
  },
  {
    "_key":"bob",
    "weight":{"Groonga":50},
    "tags": ["Groonga"]
  },
  {
    "_key":"carol",
    "weight":{"Groonga":40,"Mroonga":30},
    "tags": ["Groonga", "Mroonga"]
  }
]



In the simplest case, you can just use the filter option to get the
people who use "Groonga".

select User --output_columns _key,_score,* --sortby -_score --filter 'tags @ "Groonga"'



However, we want to take the rating into account in this case, so we use
the adjuster option:

select User --output_columns _key,_score,* --sortby -_score --filter 'tags @ "Groonga"' --adjuster 'weight @ "Groonga" * 10'



Here is the parameter for adjuster:

'weight @ "Groonga" * 10'



This expression looks up the key "Groonga" in the weight column. If the
key exists, its weight is multiplied by 10 and the result is added to the
score.

As a result, "bob" is at the top of the results:

["bob",511,["Groonga"],{"Groonga":50}],
["carol",411,["Groonga","Mroonga"],{"Groonga":40,"Mroonga":30}],
["alice",311,["Groonga","Mroonga"],{"Groonga":30,"Mroonga":20}]



If you want to favor people who use not only "Groonga" but also
"Mroonga", specify "Mroonga" in the adjuster:

select User --output_columns _key,_score,* --sortby -_score --filter 'tags @ "Groonga"' --adjuster 'weight @ "Mroonga" * 10'



As a result, "carol" is at the top of the results:

["carol",311,["Groonga","Mroonga"],{"Groonga":40,"Mroonga":30}],
["alice",211,["Groonga","Mroonga"],{"Groonga":30,"Mroonga":20}],
["bob",1,["Groonga"],{"Groonga":50}]
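
The scores in both result sets are consistent with a simple rule: a record that matches the filter gets a score of 1, and the adjuster then adds (weight + 1) * factor when the key is present. The following Python sketch reproduces the numbers under that assumption. It is a model of the arithmetic inferred from the results above, not Groonga's implementation, and the function names adjusted_score and ranking are hypothetical.

```python
# Weight vectors from the sample data loaded above.
users = {
    "alice": {"Groonga": 30, "Mroonga": 20},
    "bob": {"Groonga": 50},
    "carol": {"Groonga": 40, "Mroonga": 30},
}


def adjusted_score(weights, key, factor, base=1):
    """Model the score: base from the filter match, plus the adjuster's share.

    Assumption: a matched key contributes (weight + 1) * factor.
    """
    if key not in weights:
        return base
    return base + (weights[key] + 1) * factor


def ranking(key, factor=10):
    """Return (user, score) pairs sorted by descending score."""
    scored = [(name, adjusted_score(w, key, factor)) for name, w in users.items()]
    return sorted(scored, key=lambda pair: -pair[1])


print(ranking("Groonga"))  # models --adjuster 'weight @ "Groonga" * 10'
print(ranking("Mroonga"))  # models --adjuster 'weight @ "Mroonga" * 10'
```

Every sample user has the "Groonga" tag, so all three pass the filter and start from a score of 1. That is why "bob", who has no "Mroonga" weight, still appears in the second result with a score of 1.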



Conclusion

See Release 4.0.1 2014/03/29 for
detailed changes since 4.0.0.

Let's search by Groonga!