Welcome! Log In Create A New Profile

Advanced

WD25: SQL join appears to match different non-alpha-numeric characters

Posted by Mike James 
WD25: SQL join appears to match different non-alpha-numeric characters
October 13, 2020 07:39AM
Hi, a join on two fields, which have slightly different content, produces a match:


1: RB3548N 002/58 from an fic table 30 bytes)
2: RB3548N 002-58 from a CSV

There is a space in the middle of each string.

I'd be very grateful for any comments.

Regards
Mike
Re: WD25: SQL join appears to match different non-alpha-numeric characters
October 13, 2020 09:02AM
Hi Mike

Depends on how you defined the index I believe.
If you have used an index that doesn't take special characters and spaces into account (/ and - in this case), they are identical.

Cheers

Peter



Edited 1 time(s). Last edit at 10/13/2020 09:52AM by PeHoBe.
Re: WD25: SQL join appears to match different non-alpha-numeric characters
October 15, 2020 12:20AM
Hi Peter

Thanks for your suggestion. The table has character set "Roman (characters) in ANSI standard" which I believe is the default. Space, dash and obliques are all normal ASCII characters.

Removing the spaces (in the JOIN statement) resolves the issue with this query, ie: - replace(<fic table>.prdcode,' ','') = replace(T1.Item_Number,' ','') (T1 is data from the CSV file read into another .fic table). My query no longer matches these two values.

Regards
Mike



Edited 1 time(s). Last edit at 10/15/2020 12:24AM by Mike James.
Re: WD25: SQL join appears to match different non-alpha-numeric characters
October 15, 2020 04:36AM
Hi Peter

Researched your suggestion further. Found that (in the analysis) the index option you referred to for that field was not enabled, nor for any text fields.Thank you for pointing me at it.

Do you generally enable that for all text fields, or selectively considering what data they might hold?

Meantime I am using the more general join condition on wl.stringtoutf8(<fic>.prdcode) = wl.stringtoutf8(T1.Item_Number), which will compare exactly every time.
Re: WD25: SQL join appears to match different non-alpha-numeric characters
October 15, 2020 08:42AM
Hi Mike

Why would you still want a database in ANSI format in 2020?
It seems your Project configuration is Unicode but your database is ANSI.

In the analysis, there is an option to automatically set all strings to Unicode.
If you're using HyperFile, the regenaration it will automatically update your database too.

It will prevent you form doing all this in-line conversion WL.StringTo... and specific declarations (is ANSI string, is UNICODE string).
You'll see it will make your life so much easier...

Cheers

Peter
Re: WD25: SQL join appears to match different non-alpha-numeric characters
October 15, 2020 09:41PM
Thanks Peter, I will discuss with my colleague who develops the application.

Is there any performance hit on indices by doing this?
strings are twice as long in unicode.... processing strings twice as long takes longer... Sp yes, there is a price to pay, as always
Re: WD25: SQL join appears to match different non-alpha-numeric characters
October 20, 2020 05:13AM
Thanks Argus

We set the particular field to unicode which resolved our problem. Still considering whether to apply this across the board.

Regards
Mike
to answer a previous question you asked... You should set indexes on column on which you are searching, either via queries or hreadseeks, in order to GREATLY increase the speed. So what is in the WHERE conditions of your queries is what matters.
Author:

Your Email:


Subject:


Spam prevention:
Please, enter the code that you see below in the input field. This is for blocking bots that try to post this form automatically. If the code is hard to read, then just try to guess it right. If you enter the wrong code, a new image is created and you get another chance to enter it right.
Message: