|
ASP Articles > Full Text Search
MS SQL Server 2005 full-text indexing
I am stingy, so instead of buying Visual Developer Studio 2005
and SQL Server, I made my database-driven website using the free
versions: Visual Web Developer Express and SQL
Server Express. However, I still wanted to be able to search
my database of product reviews using a full-text search, because
I was sure that Microsoft had put a lot of work into developing
this, and so it must be infinitely better than any crummy search
mechanism I could cook up myself. Here's how I did it, and what
I found...
Getting started with full-text search
- Make sure that you download the version of SQL Server Express
with the additional full-text search capabilities. This is the
larger one of the two downloads possible.
- Read up a bit on Full
Text Search concepts.
- Making the Catalog
Now you need to create a full-text catalog. The catalog
seems to be a container to put all your full-text indexes into.
So, if you are going to need to search the text of three tables,
you make three indexes (one for each table) and put all three
indexes in the one catalog.
WARNING: Some of the information of the Microsoft website relates
to the full version of SQL Server, such as the "Storage folder"
that Express doesn't have. Don't panic, you just need to create
the catalog by hand, rather than through the Management Interface.
For my application, I just made one catalog and put two of my
indexes into it. From what I understand, if your indexes are going
to be huge, you should make a separate catalog for each index.
Here is how to make a catalog:
- Start SQL Server Management Studio Express, and connect to
your database.
- Click on New Query to get a box where you can type raw SQL.
- Type:
use mydatabase
go
EXEC sp_fulltext_database 'enable'
go
CREATE FULLTEXT CATALOG mycatalog
go
- Click "Execute" to run the SQL. This will enable
fulltext searching on the database, and create the catalog,
which is in fact a folder on your computer called "mycatalog"
somewhere within the Microsoft SQL Server folder. If your catalog
is going to be HUGE, it may be worth researching some extra
parameters for this command to tell it where to put the catalog,
e.g. on a different disk to the database itself. This would
be for performance improvement more than anything else.
- Making the Index
Next we are going to create a full-text index. But if
you want to create a full-text index on a table, the table must
have a "unique, single-column, non-nullable index".
In most cases this will be your primary key. What Microsoft is
saying here is that you must have a non-null index on the table
(all primary keys automatically are this), and that the index
must be on just one column. Oh Drat! My table has a primary
key that is on two columns, so I have to make an index that is
on a single unique column in order to use full-text searching.
What to do, what to do? Well, my solution was to add a new column
to the table called "text_id". I made this an int
column, and also made it an identity column. This way, the text_id
field gets automatically filled with a unique integer for each
record that is entered, without me needing to change any of my
existing aspx pages.
Also, if you already have a primary key, but it is on a really
long field, I suggest you make an int field and use that
instead, because the full-text index that gets created will be
huge if the primary key you use is big, such as a GUID.
Ok, here is how to create the full-text index:
- If you need to add an identity column, do this first.
- Make sure you have an index on the identity column. I called
mine "myindex".
- Type in and execute this SQL:
CREATE FULLTEXT INDEX ON mydatabase.dbo.mytable
(
column_to_index
Language 0X0
)
KEY INDEX myindex ON mycatalog
WITH CHANGE_TRACKING AUTO
Where mytable is the name of the table, column_to_index
is the name of the column that is full of text you want to be
able to search on, myindex is the name of the index
(could be PK_something if you used your primary key), and mycatalog
is the name of the catalog you created earlier. The "Language"
bit just tells SQL Server to not treat this as being in any
specific language. I don't know how to specify another language
here, but you can change it later by using the SQL Server Manager
interface. I didn't notice any difference between a UK English
index and a language-neutral index. I suspect it is only important
for languages other than English.
- I did discover that sometimes when you try to make the index,
it doesn't get fully made until you do a little bit through
the user interface. To verify that things have worked, start
the SQL Server Manager interface, right-click on the table name
and choose "Modify". Right-click on the column and
choose "Full text index...". In here you need to check
that "Columns" is set to the column name you want
to index. Also check whether "Active" is Yes —
if not, it means the index isn't built yet.
- You may have to wait a while for the index to be created.
If you have a large table, the CPU usage on your PC will probably
go up to 100% while this happens. If you get problems, see the
troubleshooting section below...
- Performing a Find
The actual SELECT statement to find records is then very easy.
Here is an example:
SELECT description
FROM mytable
WHERE CONTAINS(description, '"shark attack"')
(You need to put the search terms in double-quotes, within the
single-quotes needed for the SQL)
- Eliminating Noise Words
If your text fields are in HTML, then you should add things lik
"BR" and "P" to the list of noise words.
What I discovered
At first, full-text search queries were taking about 30 seconds
to return results. This was pretty disappointing, and I started
looking for ways to tune the system up. But while I was looking,
the whole thing sorted itself out, and suddenly queries became almost
instant. It is very impressive, I can search for a phrase like "african
drumming" on my 20,000 record database almost instantly. I
don't know what caused this delay then speed-up. Presumably the
SQL Server was building the indexes, and it took a while. I'm not
sure how come I could do queries while the index was being built
though. It's a puzzle.
Also, I discovered that searching for "drum* lesson*"
returns better results than searching for "drum lesson",
because the first one will also find "drumming lessons".
So, I made it so that queries that people enter automatically have
a * put after each word.
A Bit of Troubleshooting
If it doesn't work, here are some of the problems I encountered,
and the ways of debugging and solving the problems:
Application Events
Right Click on My Computer, choose Manage, then Event Viewer, then
Application. This shows you the Application Event Log. In here you
will see any error messages from the Full Text service. Usually
one of these errors will just redirect you to the server logs themselves.
SQL Server Logfile & Fulltext Index Log-file
Have a look in the logs. I found mine in C:\Program Files\Microsoft
SQL Server\MSSQL.1\MSSQL\LOG. The fulltext logs are called SQLFTxxxxxx.LOG.
They are hard to understand, but they may help.
Try ReBooting
You need to reboot after you install FullText Indexing, and a reboot
is always a good place to start when troubleshooting. I had the
error message "Error '0x80040e09' occurred during full-text
index population" appearing in my logfiles, and no fulltext
indexes created at all until I figured out that I needed to reboot.
Commands & Queries that you can type into SQL Server 2005:
How to check the status of the full-text indexes in a catalog:
USE mydatabase;
GO
EXEC sp_help_fulltext_tables mycatalog;
GO
Enable fulltext search on a table:
USE mydatabase;
GO
ALTER FULLTEXT INDEX ON mytable ENABLE;
GO
Check if full text indexing is enabled:
USE mydatabase;
GO
SELECT DATABASEPROPERTY('mydatabase', 'IsFullTextEnabled');
GO
Update
I've been looking at this a bit more recently, and worked out a
few things:
- You can have more than one fulltext indexed column per table.
Old SQL Server versions prevented this, but 2005 allows it. So,
you can have a table with a title and a summary column, and both
can be full-text indexed.
- The language setting is important, despite what I wrote above.
I now use the code '2057' to set my indexes to be in UK English.
This means that searches on "dogs" will also find "dog".
- You can find the status of your fulltext indexes like this:
exec sp_help_fulltext_columns mytable
This shows you which columns in a table are set up for full-text
searching.
select * from sys.fulltext_indexes
This shows you the status of the current indexes and when they
were last updated. I think the key is to make sure that change_tracking_state_desc
is set to "AUTO" and that has_crawl_completed is "1".
You can also see the time of the last crawl, i.e. when the index
was last updated, which can help you see why things aren't working.
Here is my example .sql statement, that sets up three columns of
a table for fulltext indexing, and sorts out the change tracking
state, and starts a full crawl:
use content
exec sp_fulltext_database 'enable'
go
exec sp_fulltext_catalog 'cmscatalog', 'create'
go
exec sp_fulltext_table 'document', 'create', 'cmscatalog',
'PK_document'
go
exec sp_fulltext_column 'document', 'title', 'add',
'2057'
exec sp_fulltext_column 'document', 'keywords',
'add', '2057'
exec sp_fulltext_column 'document', 'search_text',
'add', '2057'
go
exec sp_fulltext_table 'document', 'activate'
go
EXEC sp_fulltext_table 'document', 'Start_background_updateindex';
go
exec sp_fulltext_table 'document', 'start_full'
go
Conclusion
Full Text Searching is easy to set up and use in SQL Server Express.
When it works, it works fast!
Page updated 31 Jan 07
|