One of the big decisions on the proposed physical architecture of our SharePoint farms has been whether we can go 64-bit on the index server. For us the ability to index .msg files is crucial, and until now I thought I would have to purchase a third-party IFilter. The problem is that the vendor does not have a 64-bit version of their IFilter, and initial discussions with them indicated that building one would not be easy.
Some recent updates to the TechNet SharePoint doco were pointed out to me yesterday; they list .msg indexing support as being available out of the box.
This is the relevant page:
File types and IFilter reference (Office SharePoint Server)
URL: http://technet2.microsoft.com/Office/en-us/library/09357d8e-37b9-4e96-b8fd-f17b990d010a1033.mspx
So this doc basically states that .msg filters are built in. Yet when I search for some text that appears in the body of a .msg file, no results are returned.
Now I’ve been down this path before with the Foxit PDF IFilters, so I’m pretty sure the problem is missing values in the registry. I also note that at the bottom of the above reference page there is a link to an article on searching OneNote 2007 files. That document provides hints as to the missing registry keys.
Install and register the OneNote IFilter (Office SharePoint Server 2007)
URL: http://technet2.microsoft.com/Office/en-us/library/2e715e42-c09b-4b4f-a082-b19e1cad96031033.mspx
Crank up regedit and check that the following key exists:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.msg
(They should do, all my installs have this by default)
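If you would rather script that check than click through regedit, something like the following Python sketch should do it. The helper names (hklm_key_exists, missing_keys) are my own inventions for illustration, and it assumes Python is available on the index server, which is not part of a standard SharePoint install. The KEY_WOW64_64KEY flag asks for the 64-bit view of the registry so that a 32-bit Python does not get redirected.

```python
# Path copied from the post; everything else here is an illustrative helper.
MSG_EXTENSION_KEY = (
    r"SOFTWARE\Microsoft\Office Server\12.0\Search\Setup"
    r"\ContentIndexCommon\Filters\Extension\.msg"
)


def hklm_key_exists(subkey):
    """Return True if HKEY_LOCAL_MACHINE\\<subkey> can be opened (Windows only)."""
    import winreg  # imported lazily so the module still loads on other platforms

    access = winreg.KEY_READ | winreg.KEY_WOW64_64KEY
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey, 0, access):
            return True
    except FileNotFoundError:
        return False


def missing_keys(subkeys, exists=hklm_key_exists):
    """Return the subkeys for which the exists() check fails."""
    return [key for key in subkeys if not exists(key)]
```

On the index server, missing_keys([MSG_EXTENSION_KEY]) should come back as an empty list if the key is there. The same helper can be pointed at any other HKLM path you want to verify.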
The following key is the one that is missing.
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\Filters\.msg]
"Extension"="msg"
"FileTypeBucket"=dword:00000001
"MimeTypes"="application/msoutlook"
[Edit 08JUL2010 - changed MimeTypes key above as per testing from others. Thanks to devparts comment below]
Now, I have taken a guess at the MIME type for Outlook. I can find no MIME types for .msg files in the registries on my servers or my workstations. A web search yielded three possibilities: “application/vnd.ms-outlook”, “application/x-msg” and “application/msoutlook”.
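Since the MIME type is a guess, it can be handy to generate the .reg text for each candidate rather than retyping it. This is only a sketch: the function name render_msg_filter_reg is made up for illustration, and the key and values are simply the ones listed above. Nothing here touches the registry; it just builds the text you would import with regedit.

```python
# The three candidates turned up by the web search mentioned in the post.
CANDIDATE_MIME_TYPES = (
    "application/vnd.ms-outlook",
    "application/x-msg",
    "application/msoutlook",
)

FILTER_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0"
    r"\Search\Setup\Filters\.msg"
)


def render_msg_filter_reg(mime_type="application/msoutlook"):
    """Build the text of a .reg file that adds the missing .msg filter key."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + FILTER_KEY + "]",
        '"Extension"="msg"',
        '"FileTypeBucket"=dword:00000001',
        '"MimeTypes"="' + mime_type + '"',
        "",
    ]
    # .reg files conventionally use Windows (CRLF) line endings.
    return "\r\n".join(lines)
```

Calling render_msg_filter_reg() with no argument gives the variant that worked for me; pass one of the other entries in CANDIDATE_MIME_TYPES to try an alternative. Export the existing Setup\Filters branch first so you can roll back.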
So the next thing was to restart the Office search service and run iisreset. I then reset all crawled content (purging the index) and ran a full crawl.
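The restart part of that can be scripted. The sketch below assumes the Office search service is registered under the name osearch, which is what I would expect on an Office SharePoint Server 2007 box but is worth confirming in the Services console first. The function names are illustrative. Resetting crawled content and kicking off the full crawl are left as manual steps in the search administration pages. By default it only prints the commands (a dry run).

```python
import subprocess


def restart_commands(service_name="osearch"):
    """Commands to bounce the search service and IIS (service name is an assumption)."""
    return [
        ["net", "stop", service_name],
        ["net", "start", service_name],
        ["iisreset"],
    ]


def run_restart(dry_run=True, service_name="osearch"):
    """Print the commands, or run them in order when dry_run is False."""
    commands = restart_commands(service_name)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first command that fails.
            subprocess.run(command, check=True)
    return commands
```

Run run_restart() to see what would be executed, then run_restart(dry_run=False) from an elevated prompt on the index server when you are happy with it.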
Fortunately I got lucky on the MIME type and the crawl successfully indexed the content of the email! Now for the caveats. I’ve only done basic testing. I haven’t checked whether the .msg IFilter will pull out an attachment (e.g. a Word doc) and index that as well. Also I haven’t checked this with Microsoft, though I think it is worthy of a bug report (at least against the doco).
Also thanks to Mutaz from MCS for his initial guidance in pointing me in the right direction on this.
