Discussion:
Multiplying matches, complex query
(too old to reply)
m***@gmail.com
2007-03-30 17:34:13 UTC
Permalink
Hi all,

I need to think of a complex way of searching data in a table and also
create a scoring system and display results considering their score.
I need to keep some tokens and also need to compute the score with
those values. The formula for computing the score might be different.
(For example: Token1^4 + Token2 * 7 .....). I'll explain further what
those tokens mean and why do i need them.
So it is useful l to know that score can be anything (useful for later
query upgrades, adapted to customer's requirements). It is like a
function - score(Token1, ... ,TokenN)

Suppose i have the following sql table:
my_table (col_1, col_2, col_3, col_4)

A match in col_1 will be 4 points, col_2 will be 3 points, col_3 will
be 2 points and the last one 1 point. This is how i compute each
token. Then each token will help me compute a score and finally
display the results.

To have an idea of what i mean, here is some user input:
"col1:a col2:b or c"
col1:a - search only column1 for values like a
c - search all columns for values like c

This is what i have:

select *
, case when `col1` LIKE '%a%' then 4 else 0 end
Token0
, case when `col2` LIKE '%b%' then 3 else 0 end
Token1
, case when `col1` LIKE '%c.%' then 4 else 0 end
+ case when `col2` LIKE '%c.%' then 3 else 0 end
+ case when `col3` LIKE '%c.%' then 2 else 0 end
+ case when `col4` LIKE '%c.%' then 1 else 0 end
Token2
, case Token0 * (Token1 + Token2)

score from `my_table` HAVING score > 0
order by score DESC

As you can see "col1:a col2:b or c" will produce a score: Token0 *
(Token1 + Token2)
because `and` becomes a `*` and `or` becomes a `+`. But the score
formula might be different on different. This must be flexible to
allow upgrades so I need those tokens.

This is the error i get:
#1054 - Unknown column 'Token0' in 'field list'

Best regards,
Marius.
strawberry
2007-03-31 10:02:37 UTC
Permalink
Post by m***@gmail.com
Hi all,
I need to think of a complex way of searching data in a table and also
create a scoring system and display results considering their score.
I need to keep some tokens and also need to compute the score with
those values. The formula for computing the score might be different.
(For example: Token1^4 + Token2 * 7 .....). I'll explain further what
those tokens mean and why do i need them.
So it is useful l to know that score can be anything (useful for later
query upgrades, adapted to customer's requirements). It is like a
function - score(Token1, ... ,TokenN)
my_table (col_1, col_2, col_3, col_4)
A match in col_1 will be 4 points, col_2 will be 3 points, col_3 will
be 2 points and the last one 1 point. This is how i compute each
token. Then each token will help me compute a score and finally
display the results.
"col1:a col2:b or c"
col1:a - search only column1 for values like a
c - search all columns for values like c
select *
, case when `col1` LIKE '%a%' then 4 else 0 end
Token0
, case when `col2` LIKE '%b%' then 3 else 0 end
Token1
, case when `col1` LIKE '%c.%' then 4 else 0 end
+ case when `col2` LIKE '%c.%' then 3 else 0 end
+ case when `col3` LIKE '%c.%' then 2 else 0 end
+ case when `col4` LIKE '%c.%' then 1 else 0 end
Token2
, case Token0 * (Token1 + Token2)
score from `my_table` HAVING score > 0
order by score DESC
As you can see "col1:a col2:b or c" will produce a score: Token0 *
(Token1 + Token2)
because `and` becomes a `*` and `or` becomes a `+`. But the score
formula might be different on different. This must be flexible to
allow upgrades so I need those tokens.
#1054 - Unknown column 'Token0' in 'field list'
Best regards,
Read up on normalization!!!!
m***@gmail.com
2007-04-03 10:17:15 UTC
Permalink
Post by strawberry
Read up on normalization!!!!
Hi i need more hints, how could this help me? Can i see a link
strawberry
2007-04-03 13:40:22 UTC
Permalink
Post by m***@gmail.com
Post by strawberry
Read up on normalization!!!!
Hi i need more hints, how could this help me? Can i see a link
It always triggers alarm bells when we see a table with several
columns, each holding the same kind of data!

I don't know what you're trying to do but typically a structure might
look like this:

token_clusters(token_cluster_id*,token_id*)

* = PRIMARY KEY

token_cluster_id | token_id
1 | 1
1 | 2
2 | 1
2 | 3
2 | 4
3 | 4
4 | 1

You might choose to store the corresponding mathematical operations in
a separate table:

operations(token_id,operation)

Now we can easily see all the operations required for a given
token_cluster (or 'row' in your non-normalized schema).
As I say, I don't know what you're trying to do, but something along
these lines is 'probably' a better way to go about doing it.

Continue reading on narkive:
Loading...