mirror of
https://github.com/ceph/ceph
synced 2025-02-22 18:47:18 +00:00
Merge pull request #40910 from galsalomon66/update_s3select_submodule_and_s3tests_head_10apr
rgw/s3select: update s3select submodule to last commit, (new features), update for test coverage(s3test)
commit
5276c2a3d0
@ -69,13 +69,13 @@ Features Support
|
||||
+=================================+=================+=======================================================================+
|
||||
| Arithmetic operators | ^ * % / + - ( ) | select (int(_1)+int(_2))*int(_9) from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| | ``%`` modulo | select count(*) from s3object where cast(_1 as int)%2 == 0; |
|
||||
| | ``%`` modulo | select count(*) from s3object where cast(_1 as int)%2 = 0; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| | ``^`` power-of | select cast(2^10 as int) from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| Compare operators | > < >= <= == != | select _1,_2 from s3object where (int(_1)+int(_3))>int(_5); |
|
||||
| Compare operators | > < >= <= = != | select _1,_2 from s3object where (int(_1)+int(_3))>int(_5); |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| logical operator | AND OR NOT | select count(*) from s3object where not (int(1)>123 and int(_5)<200); |
|
||||
| logical operator | AND OR NOT | select count(*) from s3object where not (int(_1)>123 and int(_5)<200);|
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| logical operator | is null | return true/false for null indication in expression |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
@ -83,42 +83,74 @@ Features Support
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| logical operator and NULL | unknown state | review null-handle_ observe how logical operator result with null. |
|
||||
| | | the following query return **0**. |
|
||||
| | | ``select count(*) from s3object where null and (3>2);`` |
|
||||
| | | |
|
||||
| | | select count(*) from s3object where null and (3>2); |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| Arithmetic operator with NULL | unknown state | review null-handle_ observe the results of binary operations with NULL|
|
||||
| | | the following query return **0**. |
|
||||
| | | ``select count(*) from s3object where (null+1) and (3>2);`` |
|
||||
| | | |
|
||||
| | | select count(*) from s3object where (null+1) and (3>2); |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| compare with NULL | unknown state | review null-handle_ observe results of compare operations with NULL |
|
||||
| | | the following query return **0**. |
|
||||
| | | ``select count(*) from s3object where (null*1.5) != 3;`` |
|
||||
| | | |
|
||||
| | | select count(*) from s3object where (null*1.5) != 3; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| missing column | unknown state | select count(*) from s3object where _1 is null; |
|
||||
| missing column | unknown state | select count(*) from s3object where _1 is null; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| query is filtering rows where predicate | select count(*) from s3object where (_1 > 12 and _2 = 0) is not null; |
|
||||
| is returning non null results. | |
|
||||
| this predicate will return null | |
|
||||
| upon _1 or _2 is null | |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| projection column | similar to | select case |
|
||||
| | if/then/else | when (1+1==(2+1)*3) then 'case_1' |
|
||||
| | | when ((4*3)==(12)) then 'case_2' else 'case_else' end, |
|
||||
| | | age*2 from s3object; |
|
||||
| | switch/case | cast(_1 as int) + 1 |
|
||||
| | default | when 2 then "a" |
|
||||
| | | when 3 then "b" |
|
||||
| | | else "c" end from s3object; |
|
||||
| | | |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| logical operator | ``coalesce`` :: return first non-null argumnet |
|
||||
| | select coalesce(nullif(5,5),nullif(1,1.0),age+12) from s3object; |
|
||||
| projection column | similar to | select case |
|
||||
| | if/then/else | when (1+1=(2+1)*3) then 'case_1' |
|
||||
| | | when ((4*3)=(12)) then 'case_2' |
|
||||
| | | else 'case_else' end, |
|
||||
| | | age*2 from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| logical operator | ``nullif`` :: return null in case both arguments are equal, or else the first one |
|
||||
| | nullif(1,1)=NULL nullif(null,1)=NULL nullif(2,1)=2 |
|
||||
| | select nullif(cast(_1 as int),cast(_2 as int)) from s3object; |
|
||||
| logical operator | ``coalesce {expression,expression ...} :: return first non-null argument`` |
|
||||
| | |
|
||||
| | select coalesce(nullif(5,5),nullif(1,1.0),age+12) from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| logical operator | ``{expression} in ( .. {expression} ..)`` |
|
||||
| | select count(*) from s3object |
|
||||
| | where 'ben' in (trim(_5),substring(_1,char_length(_1)-3,3),last_name); |
|
||||
| logical operator | ``nullif {expr1,expr2} ::return null in case both arguments are equal,`` |
|
||||
| | ``or else the first one`` |
|
||||
| | |
|
||||
| | select nullif(cast(_1 as int),cast(_2 as int)) from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| logical operator | ``{expression} in ( .. {expression} ..)`` |
|
||||
| | |
|
||||
| | select count(*) from s3object |
|
||||
| | where 'ben' in (trim(_5),substring(_1,char_length(_1)-3,3),last_name); |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| logical operator | ``{expression} between {expression} and {expression}`` |
|
||||
| | select count(*) from stdin |
|
||||
| | where substring(_3,char_length(_3),1) between "x" and trim(_1) |
|
||||
| | and substring(_3,char_length(_3)-1,1) == ":"; |
|
||||
| | |
|
||||
| | select count(*) from s3object |
|
||||
| | where substring(_3,char_length(_3),1) between "x" and trim(_1) |
|
||||
| | and substring(_3,char_length(_3)-1,1) = ":"; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| logical operator | ``{expression} like {match-pattern}`` |
|
||||
| | select count(*) from s3object where first_name like '%de_'; |
|
||||
| | select count(*) from s3object where _1 like \"%a[r-s]\; |
|
||||
| | |
|
||||
| | select count(*) from s3object where first_name like '%de_'; |
|
||||
| | |
|
||||
| | select count(*) from s3object where _1 like \"%a[r-s]\; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| | ``{expression} like {match-pattern} escape {char}`` |
|
||||
| | |
|
||||
| logical operator | select count(*) from s3object where "jok_ai" like "%#_ai" escape "#"; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| true / false | select (cast(_1 as int)>123 = true) from s3object |
|
||||
| predicate as a projection | where address like '%new-york%'; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| an alias to | select (_1 like "_3_") as *likealias*,_1 from s3object |
|
||||
| predicate as a projection | where *likealias* = true and cast(_1 as int) between 800 and 900; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| casting operator | select cast(123 as int)%2 from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
@ -126,6 +158,8 @@ Features Support
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| casting operator | select cast('ABC0-9' as string),cast(substr('ab12cd',3,2) as int)*4 from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| casting operator | select cast(5 as bool) from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| casting operator | select cast(substring('publish on 2007-01-01',12,10) as timestamp) from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| non AWS casting operator | select int(_1),int( 1.2 + 3.4) from s3object; |
|
||||
@ -142,24 +176,36 @@ Features Support
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| Aggregation Function | max | select max(float(_1)),min(int(_5)) from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| Aggregation Function | count | select count(*) from s3object where (int(1)+int(_3))>int(_5); |
|
||||
| Aggregation Function | count | select count(*) from s3object where (int(_1)+int(_3))>int(_5); |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| Timestamp Functions | extract | select count(*) from s3object where |
|
||||
| | | extract('year',timestamp(_2)) > 1950 |
|
||||
| | | and extract('year',timestamp(_1)) < 1960; |
|
||||
| | | extract(year,to_timestamp(_2)) > 1950 |
|
||||
| | | and extract(year,to_timestamp(_1)) < 1960; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| Timestamp Functions | dateadd | select count(0) from s3object where |
|
||||
| | | datediff('year',timestamp(_1),dateadd('day',366,timestamp(_1))) == 1; |
|
||||
| | | datediff(year,to_timestamp(_1),dateadd(day,366,to_timestamp(_1))) = 1;|
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| Timestamp Functions | datediff | select count(0) from s3object where |
|
||||
| | | datediff('month',timestamp(_1),timestamp(_2))) == 2; |
|
||||
| | | datediff(month,to_timestamp(_1),to_timestamp(_2)) = 2; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| Timestamp Functions | utcnow | select count(0) from s3object where |
|
||||
| | | datediff('hours',utcnow(),dateadd('day',1,utcnow())) == 24 ; |
|
||||
| | | datediff(hours,utcnow(),dateadd(day,1,utcnow())) = 24; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| Timestamp Functions | to_string | select to_string( |
|
||||
| | | to_timestamp("2009-09-17T17:56:06.234567Z"), |
|
||||
| | | "yyyyMMdd-H:m:s") from s3object; |
|
||||
| | | |
|
||||
| | | ``result: "20090917-17:56:6"`` |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| String Functions | substring | select count(0) from s3object where |
|
||||
| | | int(substring(_1,1,4))>1950 and int(substring(_1,1,4))<1960; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| substring with ``from`` negative number is valid | select substring("123456789" from -4) from s3object; |
|
||||
| considered as first | |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| substring with ``from`` zero ``for`` out-of-bound | select substring("123456789" from 0 for 100) from s3object; |
|
||||
| number is valid just as (first,last) | |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| String Functions | trim | select trim(' foobar ') from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| String Functions | trim | select trim(trailing from ' foobar ') from s3object; |
|
||||
@ -170,14 +216,13 @@ Features Support
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| String Functions | lower/upper | select lower('ABcD12#$e') from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| String Functions | char_length | select count(*) from s3object where char_length(_3)==3; |
|
||||
| String Functions | char_length | select count(*) from s3object where char_length(_3)=3; |
|
||||
| | character_length| |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| Complex queries | select sum(cast(_1 as int)), |
|
||||
| | max(cast(_3 as int)), |
|
||||
| | substring('abcdefghijklm', |
|
||||
| | (2-1)*3+sum(cast(_1 as int))/sum(cast(_1 as int))+1, |
|
||||
| | (count() + count(0))/count(0)) from s3object; |
|
||||
| Complex queries | select sum(cast(_1 as int)), |
|
||||
| | max(cast(_3 as int)), |
|
||||
| | substring('abcdefghijklm',(2-1)*3+sum(cast(_1 as int))/sum(cast(_1 as int))+1, |
|
||||
| | (count() + count(0))/count(0)) from s3object; |
|
||||
+---------------------------------+-----------------+-----------------------------------------------------------------------+
|
||||
| alias support | | select int(_1) as a1, int(_2) as a2 , (a1+a2) as a3 |
|
||||
| | | from s3object where a3>100 and a3<300; |
|
||||
@ -215,20 +260,94 @@ s3-select function interfaces
|
||||
|
||||
Timestamp functions
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
| The `timestamp functionalities <https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-glacier-select-sql-reference-date.html>`_ is partially implemented.
|
||||
| the casting operator( ``timestamp( string )`` ), converts string to timestamp basic type.
|
||||
| Currently it can convert the following pattern ``yyyy:mm:dd hh:mi:dd``
|
||||
| The timestamp functionalities described in `AWS-specs <https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-glacier-select-sql-reference-date.html>`_ are fully implemented.
|
||||
| The casting operator ``to_timestamp( string )`` converts a string to the timestamp basic type.
|
||||
| The ``to_timestamp`` operator is able to convert the following string format into a timestamp: ``YYYY-MM-DD-HH-MI-SS.sssss(micro secs)+-HH:MI``
|
||||
| When the time (or part of it) is missing, zeros replace the missing parts.
|
||||
| ``+/- HH:MI`` is also optional; when it is missing, it is considered as 00:00.
|
||||
| values between -12:00 and 14:00 set the timezone of the timestamp value.
|
||||
|
||||
| ``extract( date-part , timestamp)`` : function return integer according to date-part extract from input timestamp.
|
||||
| supported date-part : year,month,week,day.
|
||||
| ``extract(date-part, timestamp)`` : function returns an integer according to the date-part extracted from the input timestamp.
|
||||
| supported date-part : year,month,week,day,hour,minute,second,timezone_hour,timezone_minute.
|
||||
|
||||
| ``dateadd(date-part , integer,timestamp)`` : function return timestamp, a calculation results of input timestamp and date-part.
|
||||
| supported data-part : year,month,day.
|
||||
| ``dateadd(date-part, integer, timestamp)`` : function returns a timestamp, the result of adding the given integer number of date-part units to the input timestamp.
|
||||
| supported date-part : year,month,day,hour,minute,second.
|
||||
| dateadd operator includes timezone in calculation.
|
||||
|
||||
| ``datediff(date-part,timestamp,timestamp)`` : function return an integer, a calculated result for difference between 2 timestamps according to date-part.
|
||||
| supported date-part : year,month,day,hours.
|
||||
| ``datediff(date-part, timestamp, timestamp)`` : function returns an integer, the calculated difference between the 2 timestamps according to date-part.
|
||||
| supported date-part : year,month,day,hour,minute,second.
|
||||
| datediff operator includes timezone in calculation.
|
||||
|
||||
| ``utcnow()`` : returns a timestamp of the current time.
|
||||
| ``to_string(timestamp, format_pattern)`` : returns a string representation of the input timestamp in the given format pattern.
|
||||
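The defaulting rules described above for ``to_timestamp`` (missing time parts become zero, a missing offset is treated as 00:00, offsets range from -12:00 to 14:00) can be illustrated with a minimal Python sketch. This is not the s3select implementation. The function name ``to_timestamp_sketch``, the regular expression, and the ISO-like separators (``T`` or space before the time, ``:`` between time fields) are assumptions made for illustration only.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed input shape: date, then optional time parts, optional fraction, optional offset.
_TS = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})"
    r"(?:[T ](\d{2})(?::(\d{2})(?::(\d{2})(?:\.(\d{1,6}))?)?)?)?"
    r"(Z|[+-]\d{2}:\d{2})?$"
)


def to_timestamp_sketch(text: str) -> datetime:
    """Parse a timestamp string, filling missing parts with zeros (illustrative only)."""
    m = _TS.match(text)
    if m is None:
        raise ValueError(f"unsupported timestamp: {text!r}")
    year, month, day, hour, minute, second, frac, offset = m.groups()
    # Right-pad the fraction so that ".2" means 200000 microseconds.
    micro = int((frac or "0").ljust(6, "0"))
    if offset in (None, "Z"):
        delta = timedelta(0)  # a missing offset is considered as 00:00
    else:
        sign = -1 if offset[0] == "-" else 1
        delta = sign * timedelta(hours=int(offset[1:3]), minutes=int(offset[4:6]))
    if not timedelta(hours=-12) <= delta <= timedelta(hours=14):
        raise ValueError(f"timezone offset out of range: {offset}")
    return datetime(
        int(year), int(month), int(day),
        int(hour or 0), int(minute or 0), int(second or 0),
        micro, tzinfo=timezone(delta),
    )
```

For instance, ``to_timestamp_sketch("2009-09-17")`` yields midnight UTC on that date, because both the time and the offset are filled with zeros.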
|
||||
to_string parameters
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| Format | Example | Description |
|
||||
+==============+=================+===================================================================================+
|
||||
| yy | 69 | 2-digit year |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| y | 1969 | 4-digit year |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| yyyy | 1969 | Zero-padded 4-digit year |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| M | 1 | Month of year |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| MM | 01 | Zero-padded month of year |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| MMM | Jan | Abbreviated month of year name |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| MMMM | January | Full month of year name |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| MMMMM | J | Month of year first letter (NOTE: not valid for use with to_timestamp function) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| d | 2 | Day of month (1-31) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| dd | 02 | Zero-padded day of month (01-31) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| a | AM | AM or PM of day |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| h | 3 | Hour of day (1-12) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| hh | 03 | Zero-padded hour of day (01-12) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| H | 3 | Hour of day (0-23) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| HH | 03 | Zero-padded hour of day (00-23) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| m | 4 | Minute of hour (0-59) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| mm | 04 | Zero-padded minute of hour (00-59) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| s | 5 | Second of minute (0-59) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| ss | 05 | Zero-padded second of minute (00-59) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| S | 0 | Fraction of second (precision: 0.1, range: 0.0-0.9) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| SS | 6 | Fraction of second (precision: 0.01, range: 0.0-0.99) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| SSS | 60 | Fraction of second (precision: 0.001, range: 0.0-0.999) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| SSSSSS | 60000000 | Fraction of second (maximum precision: 1 nanosecond, range: 0.0-0.999999999) |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| n | 60000000 | Nano of second |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| X | +07 or Z | Offset in hours or "Z" if the offset is 0 |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| XX or XXXX| +0700 or Z | Offset in hours and minutes or "Z" if the offset is 0 |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| XXX or XXXXX | +07:00 or Z | Offset in hours and minutes or "Z" if the offset is 0 |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| x | 7 | Offset in hours |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| xx or xxxx | 700 | Offset in hours and minutes |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
| xxx or xxxxx | +07:00 | Offset in hours and minutes |
|
||||
+--------------+-----------------+-----------------------------------------------------------------------------------+
|
||||
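To make the format symbols in the table above concrete, the following Python sketch renders a timestamp using a small subset of them (year, month, day, 24-hour, minute, second). It is only an approximation for illustration: the name ``to_string_sketch`` and the token table are hypothetical, symbols outside the subset raise an error, and this is not how s3select implements ``to_string``.

```python
import re
from datetime import datetime

# Subset of the format symbols from the table; all other symbols are rejected.
_TOKENS = {
    "yyyy": lambda t: f"{t.year:04d}",
    "yy": lambda t: f"{t.year % 100:02d}",
    "y": lambda t: str(t.year),
    "MM": lambda t: f"{t.month:02d}",
    "M": lambda t: str(t.month),
    "dd": lambda t: f"{t.day:02d}",
    "d": lambda t: str(t.day),
    "HH": lambda t: f"{t.hour:02d}",
    "H": lambda t: str(t.hour),
    "mm": lambda t: f"{t.minute:02d}",
    "m": lambda t: str(t.minute),
    "ss": lambda t: f"{t.second:02d}",
    "s": lambda t: str(t.second),
}


def to_string_sketch(ts: datetime, pattern: str) -> str:
    """Render ts by replacing each run of one repeated letter with its value."""
    def render(match: re.Match) -> str:
        token = match.group(0)
        if token not in _TOKENS:
            raise ValueError(f"format symbol not covered by this sketch: {token!r}")
        return _TOKENS[token](ts)

    # A token is a letter followed by repetitions of that same letter.
    return re.sub(r"([A-Za-z])\1*", render, pattern)


print(to_string_sketch(datetime(2009, 9, 17, 17, 56, 6, 234567), "yyyyMMdd-H:m:s"))
```

With the timestamp and pattern from the ``to_string`` example in the feature table, the sketch prints ``20090917-17:56:6``, matching the documented result.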
|
||||
|
||||
Aggregation functions
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
@ -246,11 +365,15 @@ Aggregation functions
|
||||
String functions
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
| ``substring(string,from,to)`` : return a string extract from input string according to from,to inputs.
|
||||
| ``substring(string,from,to)`` : substring( string ``from`` start [ ``for`` length ] )
|
||||
| returns a string extracted from the input string according to the from,to inputs.
|
||||
| ``substring(string from )``
|
||||
| ``substring(string from for)``
|
||||
|
||||
| ``char_length`` : returns the number of characters in the string (``character_length`` does the same).
|
||||
|
||||
| ``trim`` : trims leading/trailing characters from target string, the default is blank character.
|
||||
| ``trim`` : trim ( [[``leading`` | ``trailing`` | ``both`` remove_chars] ``from``] string )
|
||||
| trims leading/trailing(or both) characters from target string, the default is blank character.
|
||||
|
||||
| ``upper\lower`` : converts characters into uppercase/lowercase.
|
||||
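The ``substring`` forms described above accept a negative ``from`` and an out-of-bound ``for``. One plausible reading is that the requested window is clamped to the string, which is also the behaviour of standard SQL. The Python sketch below models that reading. The name ``substring_sketch`` is hypothetical, and s3select may differ in edge cases not shown in this document.

```python
from typing import Optional


def substring_sketch(text: str, start: int, length: Optional[int] = None) -> str:
    """1-based substring whose window is clamped to the bounds of text."""
    first = start
    # Inclusive index of the last requested character.
    last = len(text) if length is None else start + length - 1
    first = max(first, 1)        # a start before the string is treated as the first character
    last = min(last, len(text))  # an end past the string is treated as the last character
    if last < first:
        return ""
    return text[first - 1:last]


print(substring_sketch("123456789", -4))      # substring("123456789" from -4)
print(substring_sketch("123456789", 0, 100))  # substring("123456789" from 0 for 100)
```

Under this clamping assumption both calls return the whole string, and ``substring_sketch("19501231", 1, 4)`` returns ``"1950"``, the year-prefix pattern used by the ``substring`` example in the feature table.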
|
||||
@ -267,6 +390,27 @@ Alias
|
||||
|
||||
| Of course, the cache is invalidated for each new row.
|
||||
|
||||
Testing
|
||||
~~~~~~~
|
||||
|
||||
| s3select contains several testing frameworks which provide a large coverage for its functionalities.
|
||||
|
||||
| (1) tests comparison against trusted engine, meaning, C/C++ compiler is a trusted expression evaluator,
|
||||
| since the syntax for arithmetical and logical expressions is identical (s3select compared to C)
|
||||
| the framework runs equal expressions and validates their results.
|
||||
| A dedicated expression generator produces different sets of expressions per each new test session.
|
||||
|
||||
| (2) compares results of queries whose syntax is different but which are semantically equal.
|
||||
| this kind of test validates that different runtime flows produce identical results,
|
||||
| on each run with a different (random) dataset.
|
||||
|
||||
| For example, on a dataset which contains random numbers (1-1000),
|
||||
| the following queries will produce identical results.
|
||||
| ``select count(*) from s3object where char_length(_3)=3;``
|
||||
| ``select count(*) from s3object where cast(_3 as int)>99 and cast(_3 as int)<1000;``
|
||||
|
||||
| (3) constant dataset, the conventional way of testing. A query processes a constant dataset, and its result is validated against constant results.
|
||||
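The idea behind test type (2) can be sketched in plain Python, without running s3select at all: two predicates that are written differently but mean the same thing must count the same rows on any random dataset. The helper names below are hypothetical, and each predicate is only a Python stand-in for the corresponding query.

```python
import random


def count_by_length(rows):
    # Stand-in for: select count(*) from s3object where char_length(_3)=3;
    return sum(1 for value in rows if len(value) == 3)


def count_by_range(rows):
    # Stand-in for: select count(*) from s3object
    #               where cast(_3 as int)>99 and cast(_3 as int)<1000;
    return sum(1 for value in rows if 99 < int(value) < 1000)


def run_equivalence_check(seed, size=1000):
    """Build a random dataset of numbers in 1-1000 (as strings) and count both ways."""
    rng = random.Random(seed)
    rows = [str(rng.randint(1, 1000)) for _ in range(size)]
    return count_by_length(rows), count_by_range(rows)


for seed in range(5):
    by_length, by_range = run_equivalence_check(seed)
    assert by_length == by_range, (seed, by_length, by_range)
```

The two counts agree because a number in 1-1000 has exactly three digits when, and only when, it lies between 100 and 999.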
|
||||
Sending Query to RGW
|
||||
--------------------
|
||||
|
||||
|
@ -1,5 +1,5 @@
|
||||
tasks:
|
||||
- s3tests:
|
||||
client.0:
|
||||
force-branch: ceph-master
|
||||
force-branch: tethology_s3select_apr19
|
||||
rgw_server: client.0
|
||||
|
@ -1 +1 @@
|
||||
Subproject commit 1ea37ae459f957786b9d198510637dfa12f83350
|
||||
Subproject commit 63129ea4d2777204d0ddc2786c11062b6884a88b
|