Thursday, August 5, 2010

Solr DataImportHandler and OData Service Issue

This week I was working on a proof of concept for indexing data from a SQL database with Solr. I had done something similar with XML files, using some .NET code to read the files into a document class that I then inserted into Solr with the excellent SolrNet library. For this project, however, I wanted to try out the Solr DataImportHandler (DIH), since this was a pretty straightforward table-to-index mapping. So I read up on using the URLDataSource with the DIH and set about creating an OData service to expose the data.

I was able to create my OData service pretty quickly, and then I set up the DIH following the steps on the SolrWiki. However, when I attempted to call the OData service from the DIH, it kept failing with an HTTP 400 error when accessing the OData URL. This was really strange, because I was able to access the OData service from the browser without any issues. It was only when I used the DIH that I had this problem.

With some help from a colleague, we determined that the issue was in the HTTP Accept header sent by the DIH in its call to the OData service. The server was generating the error “Media type requires a ‘/’ character”.

The Accept header sent by the DIH was:

text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2

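The issue is the last two media types, “*; q=.2, */*; q=.2”. The first of them, a bare “*”, contains no slash at all, which matches the error message. Beyond that, the OData services did not appear to recognize the relative quality factor (the q value) as described in the HTTP Header Field Definitions in RFC 2616, Section 14. I even tried connecting to the OData Northwind Sample Service, but had the same issue.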

To work around this issue, I created the following HttpModule. It drops any Accept entry that has no slash and replaces “*/*; q=.2” with a plain “*/*”.

using System;
using System.Collections.Generic;
using System.Web;

namespace SolrIISModules
{
    // Rewrites the Accept header sent by the Solr DIH before the OData handler sees it.
    public class AcceptFilter : IHttpModule
    {
        // Server variable that holds the incoming Accept header.
        private const string ACCEPTHEADER = "HTTP_ACCEPT";
        private const string BADSOLRACCEPT = "*/*; q=.2";

        public void Init(HttpApplication context)
        {
            context.PreRequestHandlerExecute += context_PreRequestHandlerExecute;
        }

        private static void context_PreRequestHandlerExecute(object sender, EventArgs e)
        {
            var application = sender as HttpApplication;
            if (application == null) return;

            var request = application.Context.Request;
            if (string.IsNullOrEmpty(request[ACCEPTHEADER])) return;

            var acceptValues = request[ACCEPTHEADER].Split(',');
            var filteredValues = new List<string>();
            foreach (var value in acceptValues)
            {
                // Drop entries with no '/' (for example "*; q=.2").
                if (!value.Contains("/")) continue;
                if (IsBadSolrValue(value))
                {
                    // Only add the */* back in, otherwise an HTTP 415 error will be generated.
                    filteredValues.Add("*/*");
                    continue;
                }
                filteredValues.Add(value);
            }

            // Replace the original Accept header with the filtered list.
            application.Context.Request.Headers.Set("accept", String.Join(",", filteredValues.ToArray()));
        }

        private static bool IsBadSolrValue(string acceptValue)
        {
            // Case-insensitive match, ignoring surrounding whitespace.
            return string.Compare(acceptValue.Trim(), BADSOLRACCEPT, true) == 0;
        }

        public void Dispose()
        {
        }
    }
}
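
To make the effect of the filter concrete, here is a minimal sketch that applies the same rules as the module above to the Accept header the DIH sends. The AcceptHeaderDemo class and its Filter method are hypothetical names introduced only for this illustration; they are not part of the module, and the sketch runs as a plain console program without System.Web or IIS.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the original module): the same filtering
// rules, extracted so they can be run without System.Web or IIS.
public static class AcceptHeaderDemo
{
    private const string BadSolrAccept = "*/*; q=.2";

    public static string Filter(string acceptHeader)
    {
        var filtered = new List<string>();
        foreach (var value in acceptHeader.Split(','))
        {
            // Entries such as "*; q=.2" contain no '/' and are dropped.
            if (!value.Contains("/")) continue;

            // "*/*; q=.2" is replaced by a plain "*/*".
            if (string.Compare(value.Trim(), BadSolrAccept, true) == 0)
            {
                filtered.Add("*/*");
                continue;
            }

            // Everything else is kept exactly as received, leading space included.
            filtered.Add(value);
        }
        return string.Join(",", filtered.ToArray());
    }

    public static void Main()
    {
        const string header = "text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2";
        // Prints: text/html, image/gif, image/jpeg,*/*
        Console.WriteLine(Filter(header));
    }
}
```

The three ordinary media types pass through untouched, the bare “*” entry disappears, and the catch-all survives without its q value, so the OData service still has a “*/*” to match against.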




Now I just enable this module on the IIS website that hosts the OData service, and the DIH works as expected. I have tested this with an OData service hosted in both ASP.NET Web Forms and ASP.NET MVC websites.
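
For reference, enabling the module can be sketched as a web.config entry like the one below. This is only a sketch under a few assumptions: that the site runs on IIS 7 or later in integrated pipeline mode (as far as I know, rewriting request headers from a module is not supported in classic mode), and that the module is compiled into an assembly named SolrIISModules. That assembly name is a guess based on the namespace, so adjust it to match your own project.

```xml
<configuration>
  <system.webServer>
    <modules>
      <!-- The text after the comma is an assumed assembly name; change it to your own. -->
      <add name="AcceptFilter" type="SolrIISModules.AcceptFilter, SolrIISModules" />
    </modules>
  </system.webServer>
</configuration>
```

The same registration can also be done through the Modules feature in IIS Manager instead of editing the file by hand.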