The Chicken Test is brought to you by W.R. and Bryce who live in Seattle. Bryce spends too much time on Twitter and posts photos to Flickr; sometimes you'll find him on Facebook. Look for both of them at BarCamps.

Archive for March, 2010

Paula Thornton’s notes on #SharePointSearch from March 30, 2010

Paula Thornton (@rotkapchen) tweeted some great notes about Tagging and Taxonomies in SharePoint 2010 along with some valuable insight into deciding whether or not to pay for the FAST search license.

  • Materials being presented by Smartlogic, with product Semaphore, a taxonomy product integrated with FAST search.
  • Why are taxonomies needed? Terms have different meanings; meaning is contextual. Expert vocabularies may not match search terms.
  • Terms may be in documents that are really irrelevant to the overall focus of the piece and would be meaningless results.
  • Taxonomy: Allows for capture of domain knowledge, vocabulary and topical relationships. An ontology is multiple taxonomies.
  • Content as an asset should be organized/managed. Ontologies help with this (what’s missing, what are the relationships).
  • Ontologies can help support records management policies.
  • [Professional services firm Ascentium now presenting their experiences doing implementations w/2010, including inside Microsoft]
  • 2010 has a utility, Term Store Manager, for facilitating taxonomical activities.
  • Concerns: Rely on users for taxonomies? Preferably not, but allow them to add their own folksonomies.
  • Concerns: If you build a taxonomy, it must be maintained, and many companies are not willing to add staff to maintain it.
  • The manager is extensible, but there are challenges (encountered inside Microsoft during its 'eat your own dogfood' effort).
  • One primary goal: improve findability. Architecture of Term Store: Group, Term Set, Term
  • [search for blogs that deal with Term Store Manager -- they offer great advice]
  • [BTW Term Store Manager is still in beta and not yet generally released. Great steps forward, but one step back.]
  • [Showing UI. Have created two groups: fruits, vegetables. Adding tomato to both. Can 'borrow' but also have multi-terms]
  • [Have talked to Microsoft about the issues with disambiguation when multiple meanings are presented for a term.]
  • Added Term Set “vine” to house “tomato”. But have different Term Sets of Vine under Fruit and Vegetables. Different GUIDs
  • [Issue: Right now anyone with access can change definitions. Governance models needed to manage for disambiguation.]
  • [I'm still trying to figure out why they're going to all this effort when FAST does most of this automatically.]
  • [I'm guessing this is still 'brute force' version of SharePoint, which still requires 'brute force' search management]
  • [The dropdown lists can get unwieldy because there is no description in the dropdowns, no way to know which term is which]
  • [Back to Smartlogic staff] Small or Large Ontology? Small: Easier to maintain, buckets larger, results less granular.
  • Asking users to self-tag against a really large taxonomy requires considerable effort (requires understanding of whole)
  • Companies restructure, change clients, add lines of business: all affect the total taxonomy (and existing content base)
  • In 2007 there were no real taxonomies available. Keywords are not the same thing. Results can be tuned via taxonomy.
  • Smartlogic now demoing Semaphore. The taxonomy is only half the equation. The content itself must have relevant metadata.
  • [Semaphore is totally integrated as an add-on to the SharePoint UI, as if it were simply utilities in SharePoint.]
  • Adding a document to SharePoint brings up the Semaphore UI for taxonomical additions, via ‘assisted’ tagging.
  • That is, there are recommendations made to which edits can be made. Effectively ‘automated’ with overrides.
  • This UI is useful for existing SharePoint stores to be reviewed and classified as well. [Again, only relevant w/o FAST]
  • [With this level of effort, I'm still trying to figure out why a company wouldn't pay for the FAST licenses instead?]
  • [Interesting navigation of topic maps and managing the 'collection' from the whole rather than the discrete elements.]
  • Semaphore Architecture: Allows existing structures to be imported and reorganized. Text mining and classification.
  • Rules-based approach used for classification server to make recommendations (rules can be tweaked).
  • Now talking about the FAST Server and explaining the greater control over the results, multi-content results, etc.
  • May 11th there will be another event to cover the FAST Server in more detail.
  • Q&A raised cross-cultural and language issues for global taxonomies. See Motorola study http://twurl.nl/f3crub
  • Bottom line, assuming that you can get meaningful search results out of the box from SharePoint is erroneous.
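
The Group, Term Set, Term hierarchy and the "tomato" demo from the notes can be sketched in a few lines of Python. This is only an illustrative model, not the SharePoint API: the class names (Group, TermSet, Term) and the find_term helper are hypothetical, and the GUIDs are generated locally. It shows why the same label under two groups ends up as two distinct terms, and why a dropdown that shows only the label cannot tell them apart.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Term:
    label: str
    # Every term gets its own identifier, so equal labels remain distinct terms.
    guid: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class TermSet:
    name: str
    terms: dict = field(default_factory=dict)

    def add_term(self, label):
        term = Term(label)
        self.terms[label] = term
        return term


@dataclass
class Group:
    name: str
    term_sets: dict = field(default_factory=dict)

    def add_term_set(self, name):
        term_set = TermSet(name)
        self.term_sets[name] = term_set
        return term_set


def find_term(groups, label):
    """Return (group name, term set name, term) for every term with this label."""
    hits = []
    for group in groups:
        for term_set in group.term_sets.values():
            if label in term_set.terms:
                hits.append((group.name, term_set.name, term_set.terms[label]))
    return hits


# Recreate the demo: two groups, each with its own "Vine" term set and "tomato" term.
fruits = Group("Fruits")
vegetables = Group("Vegetables")
fruit_tomato = fruits.add_term_set("Vine").add_term("tomato")
veg_tomato = vegetables.add_term_set("Vine").add_term("tomato")

matches = find_term([fruits, vegetables], "tomato")
# Showing the full path is one way to disambiguate terms that share a label.
paths = [f"{group} > {term_set} > {term.label}" for group, term_set, term in matches]
print(paths)
```

Looking up "tomato" returns two matches with different GUIDs. Displaying the full path rather than the bare label is one possible way around the unwieldy-dropdown problem raised in the notes.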