Posts Tagged ‘blog search’

In Social Media Monitoring, Focus Pays Big Dividends

Friday, January 15th, 2010

The volume of social media content can be overwhelming, even with the best-designed search strategy. And even then, there is the ever-present spam problem…

Because of these challenges, focus is key to navigating the social web. How do I focus my search so that my results stay relevant? One of my favorite features in the Community Insights platform is called “Community Lists”. This feature allows users to create discrete, spam-free source lists, even across multiple content types (blogs, micro-blogs, forums, online news, etc.). I use this feature to segment coverage: one list for tracking high-impact industry influencers, another for bloggers I want to engage with regularly, and even one for internal sources so I can monitor our departmental activity levels. There are many other possible use cases.

Of course, there are times when I want to search across the full blogosphere, for example when using our discovery feature to surface emerging topics. But day-to-day, using Community Lists saves me a great deal of time and increases the precision of my analysis.

To find out how Biz360 can power your insights, visit us here, or get started here. Thanks for visiting!

Developing Good Topic Searches

Thursday, January 7th, 2010

I often get asked for help with topics – I guess that’s because I have created tens of thousands of topics over the years. For those of you who are new to our Community Insights platform, topics are saved searches. I have compiled a set of general tips on creating topics in the CI application, but these could apply to searching the internet in general.

  • While it’s well worth your time to plan the search strategy before actually creating a topic, it’s best not to spend too much time trying to arrive at a search that will give you perfect results the first time around, especially if it’s a concept that you have never searched on before. Start by keeping it general, and then narrow it down for more specific results. (Yes, creating a topic search strategy is a fun, creative process!)
  • It’s a misconception that you need to list every possible search term that might yield relevant results. Usually a few good search terms will yield 80% of the results. So rather than trying to find an exhaustive list of search terms, your goal is to find the best search terms.
  • The easiest searches are those for brand names – names of companies and products. For example, if you search for Biz360 you can be pretty certain that most, if not all of the results returned will be about the company Biz360.
  • Searching for brand names is more difficult when the names are not unique.  For example, if you want results for the auto manufacturer Mini, you would have to use other qualifying terms along with Mini because the word on its own is not unique enough. So don’t run a search for just Mini. You would start by searching for Mini Cooper or Mini AND Car. Once you review a page full of results you will have a pretty good idea if you need to widen or narrow your search.
  • Planning for a topic involves identifying the main concepts that you want to search on. Make a list of terms for each concept. Also make a list of excludes.
  • Terms could be single terms like Mini or phrases like Mini Cooper. I’ll illustrate this through an example. Let’s say I want to find conversations about fuel efficiency for the Mini – there are two concepts: the Mini and fuel efficiency.

Mini: Mini Cooper, Mini AND car

Fuel Efficiency: fuel efficiency, fuel economy, fuel efficient, mileage, miles per gallon, mpg, fuel consumption

Excludes: netbook, desktop, Mac, HP, Apple, Dell, mini van

  • Run individual searches on each of the terms before running a combined search because it will help you to find:
    • The best search terms (in terms of number and quality of results)
    • Synonyms and other related terms
    • Irrelevant results for which you need to write excludes – here again don’t write excludes for every irrelevant post, only for terms that bring in several irrelevant results.
    • Terms that need additional qualifying terms to produce relevant results
  • Once you arrive at the list of the best terms, generalize and remove redundant terms. For example, you may not need all of the fuel-related terms above; the single term fuel may be sufficient.
  • If it is a simple concept, you could use the Simple interface (either in Google or in our Community Insights application). But if there are multiple concepts with complex relationships, you will need to use the Advanced interface, which allows you to use Boolean operators (AND, NOT and OR) and parentheses to connect the terms.
  • It’s always a good practice to preview search results before saving a topic to make sure you have no errors.
  • Though building topics is an iterative process, you also need to use judgment about when to stop tweaking a topic, because it’s impossible to get only relevant results, especially with social media content.
  • If you are a Community Insights user and require more specific topic creation syntax in the application, log in to our Support Portal (http://ci.biz360.com/support/portal) and run a search for Topics to access our Topic Creation videos.
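
To make the Mini example above concrete, here is a minimal Python sketch of how the three term lists could be combined into one Boolean query string. The function names (format_term, or_group, build_query) are made up for illustration, and the output uses generic AND / OR / NOT syntax with quoted phrases; it is an assumption that this resembles what the Advanced interface accepts, so check the actual topic creation syntax before using it.

```python
def format_term(term):
    """Format one search term for a generic Boolean query (hypothetical syntax)."""
    # An explicit sub-expression such as "Mini AND car" is kept as-is in parentheses.
    if " AND " in term:
        return f"({term})"
    # Multi-word phrases are quoted so they are matched as a phrase.
    if " " in term:
        return f'"{term}"'
    return term


def or_group(terms):
    """Join the terms of one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(format_term(t) for t in terms) + ")"


def build_query(concepts, excludes):
    """AND the concepts together, then subtract the excludes with NOT."""
    query = " AND ".join(or_group(terms) for terms in concepts)
    if excludes:
        query += " NOT " + or_group(excludes)
    return query


# Term lists taken from the example above.
mini = ["Mini Cooper", "Mini AND car"]
fuel = ["fuel efficiency", "fuel economy", "fuel efficient", "mileage",
        "miles per gallon", "mpg", "fuel consumption"]
excludes = ["netbook", "desktop", "Mac", "HP", "Apple", "Dell", "mini van"]

query = build_query([mini, fuel], excludes)
print(query)
```

Keeping each concept as its own list makes the iteration described above cheap: after previewing results you can drop a redundant term or add an exclude in one place and rebuild the query.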

Google’s Metrics are Meaningless

Thursday, October 29th, 2009

One question I often get asked is, “Why can’t I use Google Blog Search to track my coverage?” There are a variety of reasons one would not want to do this; most importantly, Google’s metrics are meaningless.

To demonstrate the flaws in Google’s metrics, I decided to check out the blog coverage of Google Wave. A quick search in Google Blog Search returned about 1,569,236 results. Was this a lot of conversation? Looking over on the left, I saw the time frame was set to anytime. Anytime is a little ambiguous, so I narrowed it down to last week, and it returned about 15,419 results. Using a separate browser, I ran the same query and it returned about 27,085 results. That’s a difference of 11,666 results. How could this be? It was from the same machine, just different browsers (one being Safari, the other Firefox). In fact, each time I hit refresh the numbers changed.

Aside from different browsers getting different results, Google has another problem: the problem of counting. Running a query for “Apple TV” for the date range of 9/22-9/24 returned 1,526 results. I wanted to know if there was a spike in conversation between these days, so I ran the query once for each day. The queries returned 162, 160, and 142 results for 9/22, 9/23, and 9/24 respectively. Adding those numbers gives 464 results. The math didn’t make sense (464 does not equal 1,526). As it turns out, there is an explanation.
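
The consistency check is easy to script. Below is a small Python sketch using the “Apple TV” counts reported above; the variable names are mine, and the numbers are simply the ones from my queries, not fresh results.

```python
# Counts reported by Google Blog Search for "Apple TV" (from the queries above).
range_count = 1526                      # one query covering 9/22-9/24
daily_counts = {"9/22": 162, "9/23": 160, "9/24": 142}

daily_total = sum(daily_counts.values())   # 464
gap = range_count - daily_total            # results the daily queries do not account for
ratio = range_count / daily_total          # how many times larger the range count is

print(f"Sum of daily counts: {daily_total}")
print(f"Range query count:   {range_count}")
print(f"Unexplained gap:     {gap} ({ratio:.2f}x the daily total)")
```

If the counts were exact, the three daily numbers would add up to the range number. Here the range query reports more than three times the daily total.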

The number Google provides is only an approximation based on the probability of the search terms’ occurrence in blogs. Although I was not able to get an official word from Google (I’ll update the post if I hear back from them on the matter), there is a quote from an unnamed Google employee. It’s old, but after testing the results, it seems they haven’t done much in this area.

There are small variations in the number of results due to the fact that index updates are done at different times in different data centers. But there are much larger variations due to the fact that these are all estimates, and we just haven’t tried that hard to make the estimates precise. To figure out the number of results in the query [a OR b], we need to intersect two posting lists. But we don’t want to pay the price of intersecting all the way to the end, so we do a prefix and then extrapolate. The extrapolation is done with the help of some parameters that were carefully tuned several years ago, but haven’t been reliably updated as the index has grown and the web has changed, so sometimes the results can be off.
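
To see why a “prefix and then extrapolate” approach can drift, here is a toy Python sketch. It is my own illustration, not Google’s code: the function names, the prefix_fraction parameter, and the random posting lists are all assumptions. It counts documents that appear in both posting lists, but only within the first slice of the document ID space, and then scales that count up.

```python
import random
from itertools import takewhile


def exact_intersection_size(a, b):
    """Count documents present in both posting lists by reading them fully."""
    return len(set(a) & set(b))


def estimated_intersection_size(a, b, max_doc_id, prefix_fraction):
    """Intersect only the prefix of each sorted list, then extrapolate.

    Only doc IDs below `max_doc_id * prefix_fraction` are read; the number
    of matches found there is scaled up to the full ID space.
    """
    cutoff = int(max_doc_id * prefix_fraction)
    prefix_a = set(takewhile(lambda doc_id: doc_id < cutoff, a))
    prefix_b = set(takewhile(lambda doc_id: doc_id < cutoff, b))
    hits = len(prefix_a & prefix_b)
    return round(hits / prefix_fraction)


# Hypothetical posting lists: sorted doc IDs of posts containing each term.
rng = random.Random(0)
max_doc_id = 100_000
list_a = sorted(rng.sample(range(max_doc_id), 20_000))
list_b = sorted(rng.sample(range(max_doc_id), 15_000))

exact = exact_intersection_size(list_a, list_b)
estimate = estimated_intersection_size(list_a, list_b, max_doc_id, 0.05)
print(f"exact: {exact}, estimate from a 5% prefix: {estimate}")
```

With evenly spread random data the estimate lands fairly close to the exact count. When matches cluster in one part of the index, the extrapolation can be far off: in the tiny case estimated_intersection_size([1, 2, 3, 50], [2, 3, 60], 100, 0.1), both real matches fall inside the prefix, so the function reports 20 instead of 2.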

Bottom line, Google’s search results are not meant to be used as an analytics platform.
