Apr 27, 2013

Barista Search Index Q&A


Q&A:

Q: What powers the search index under the covers?

A: The Barista Search Index is powered by Lucene.Net 3.0.3, which is available on NuGet. It hosts the Lucene search engine in a separate, SharePoint-managed Windows service. This lets the service operate independently of app pool recycles and lets the index shut down cleanly.

Q: Who else uses Lucene? Is it mature?

A: A number of sites use Lucene. See http://wiki.apache.org/lucene-java/PoweredBy.

Stack Overflow uses Lucene to power its search. Sites like Twitter Analytics use it as well.

Lucene has been around since 1999, so I would venture that it’s pretty darn mature. Lucene and Lucene.Net are both active, Apache-sponsored open-source projects.

Q: Why would I use the Barista Search Index over FAST Search or SharePoint Search?

A: The main reason to use the Barista Search Index is to provide application-level search that would otherwise be difficult or administratively burdensome to set up. With FAST or SharePoint Search behind an application-level search, new content is not available until the next crawl. Even with the continuous crawl option in SharePoint 2013, new content may take 5-10 minutes to appear, depending on the load on the crawler. FAST used to allow documents to be inserted directly into its index through the Content API, but that capability is deprecated, and I’m not aware of a Content API for SharePoint Search in SharePoint 2013. Inserting documents into FAST via SQL has traditionally been a burdensome process.

You can get going extremely quickly with the Barista Search Index, without making farm-wide configuration changes, requiring FAST, and so forth.

For searching beyond the scope of a single application, such as portal-wide search or searching within documents, FAST or SharePoint Search is still recommended.

Q: I have a lot of data that I may want to index. Can I hit the index directly without going through a Barista service?

A: Yes. The Barista Search Index exposes a WCF endpoint that allows data to be indexed and search results to be retrieved through a WCF client. The endpoint is available at http://appservername:8500/Barista/Search/mex. It is strongly typed and should be usable from any SOAP client. If the administrator has changed the endpoint address, you’ll need to update your client accordingly.

Q: Is there a service locator for the Barista Index Service?

A: Yes, by referencing and using the SPBaristaSearchServiceProxy class. This class can locate the Barista Search Index within the SharePoint farm and configure the client with the endpoint address.

Q: Will you give me cookies?

A: No. Unless you’re blue and furry.

Read More
Apr 26, 2013

Introducing the Barista Search Index


Say you have a set of data that you would like to query easily. The data may be an aggregation of list items from separate lists, sets of data stored in a Barista document store, information stored in XML files and not otherwise promoted to fields of list items (for instance, InfoPath forms stored in SharePoint), data gathered from the web via Ajax calls, data from SQL Server surfaced via the SQL Data bundle, and so forth.

The Barista Search Index bundle allows loosely typed content to be stored and later queried. Note that the search index should not be used as a database: in the architecture of your data storage solution, keep the persistent data store separate from the search index so that you can re-create the index from the persistent store. Also, the Barista Search Index is an information retrieval system, which means it has slightly different semantics than you would expect from a traditional database.

Further, since the Barista Search Index is loosely typed, it works with any JSON document without needing a schema defined first. As long as you have a JSON document and you specify an id field, that JSON object can be added to the index. This is convenient, since most of the functionality in Barista (including the retrieval of list items and the ability to read and parse XML documents and feeds) works with JSON objects.

Note that in its current incarnation there isn’t a way to index binary document formats such as DOCX, PDF, XLSX and so on. Support might be added in the future through IFilters, but for now, full-text indexing of document formats is out of scope.
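To make the "any JSON document with an id field" idea and the "rebuild the index from your persistent store" advice concrete, here is a minimal Python sketch. It is a toy in-memory inverted index, not the Barista Search Index API or Lucene; the names ToyIndex, iter_values, and rebuild_index are hypothetical and exist only for illustration.

```python
import re
from collections import defaultdict


def iter_values(value):
    """Yield every leaf value of a JSON-like structure as a string."""
    if isinstance(value, dict):
        for child in value.values():
            yield from iter_values(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_values(child)
    else:
        yield str(value)


class ToyIndex:
    """Toy inverted index over schema-less JSON documents (illustration only)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it

    def add(self, doc, id_field="id"):
        # The only requirement is an id field; no schema is declared up front.
        if id_field not in doc:
            raise ValueError(f"document has no '{id_field}' field")
        doc_id = str(doc[id_field])
        for text in iter_values(doc):
            for term in re.findall(r"[a-z0-9]+", text.lower()):
                self.postings[term].add(doc_id)

    def search(self, query):
        # Return ids of documents that contain every query term.
        terms = re.findall(r"[a-z0-9]+", query.lower())
        if not terms:
            return set()
        return set.intersection(*(self.postings.get(t, set()) for t in terms))


def rebuild_index(persistent_store):
    """Re-create the index from scratch; the store stays the source of truth."""
    index = ToyIndex()
    for doc in persistent_store:
        index.add(doc)
    return index


# Hypothetical sample data standing in for a persistent store.
store = [
    {"id": 1, "title": "Espresso machine manual", "tags": ["coffee", "hardware"]},
    {"id": "form-7", "applicant": {"name": "Ada"}, "notes": "prefers espresso"},
]
index = rebuild_index(store)
```

In this sketch the two documents share no fields beyond the id, yet both can be indexed and found by term. Queries match terms rather than exact column values, which is one example of how an information retrieval system differs from a database, and losing the index costs nothing because rebuild_index(store) re-creates it.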

To get started with the Barista Search Index bundle, you’ll need to set up a search index directory. A directory is a location where the actual files that store the index are kept. Setting up an index is done through the Barista Service Application management page in Central Admin.

In the following screenshot, I have a number of indexes set up, mostly for test purposes. You’ll notice that there are three types of index to choose from: RAM Directory, a fast but volatile index; File Directory, which stores files on the file system or a network share; and SharePoint Directory, which stores files within SharePoint. The SharePoint Directory is the slowest of the three; all that SQL I/O doesn’t come cheap.

Read More
Apr 25, 2013

SP JSOM vs Barista


The question recently came up of how to retrieve the URLs of the pictures in a picture library.

In Barista, that code looks like the following.

Read More
Apr 20, 2013

Barista Improvements


Had some time to enact some additions and fixes that have been piling up over the week:

Deployment

  • The deployment scripts have been updated and consolidated. There is now a single deployment script, Deploy.bat, that does everything needed to install Barista while you wait.
    • The deployment scripts will attempt to stop anything that might be referencing barista.core.dll, barista.sharepoint.core.dll, or barista.sharepoint.dll, in order to minimize locked assemblies and potential orphaning. This doesn’t mean that locked assemblies won’t still occur, however…
    • The deployment scripts now deploy the Barista Search Service and the Barista Web Socket Service.
  • The Visual Studio SharePoint solution deployment functionality (Right-Click Deploy) has been updated to call all the scripts as if the scripts themselves were run.
    • There was some flakiness in how CKSDev was calling the PS1 scripts, which caused a load of frustration. I’ve taken those calls out and now just call “master” scripts in the pre/post deployment steps of the SharePoint project tab. In fact, CKSDev isn’t a dependency anymore (although it’s still nice for right-click attach to w3wp… get the CKSDev 2012 version for a slimmed-down CKSDev).
    • You can right-click and select “Deploy” on Barista.SharePoint and it’ll do as you expect, deploy… with none of the previous flakiness.
Read More
Jan 23, 2013

Parsing CSV to JSON and back again


The Document bundle now has routines to transform CSV to JSON and back again.
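As a rough illustration of what such a round trip involves, here is a minimal Python sketch using only the standard library. It is not the Document bundle's API; the function names csv_to_json and json_to_csv and the sample data are hypothetical, and it assumes a header row and flat (non-nested) objects.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Parse CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)


def json_to_csv(json_text):
    """Serialize a JSON array of flat objects back into CSV text."""
    rows = json.loads(json_text)
    if not rows:
        return ""
    buffer = io.StringIO()
    # Column order is taken from the first object (an assumption of this sketch).
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = "name,roast\nKona,medium\nSumatra,dark\n"
as_json = csv_to_json(sample)
round_tripped = json_to_csv(as_json)
```

Note that CSV carries no type information, so every value comes back as a string in the JSON output; any numeric or boolean conversion would be a separate step.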

So, given this Barista service:

Read More