Dan Heller's Photography Business Blog Industry analysis from www.danheller.com

The photography world -- the business, the culture, the art, the politics, the technology.



Sunday, November 29, 2009

Why there's no one-stop shop for photo buyers

I got an interesting email today from someone that prompted me to address a question on many people's minds: why hasn't a single website emerged as the "primary" place to license images? As those in the photo industry know, Getty sells quite a bit, and microstocks fall behind them, followed in turn by a smattering of pro photographers and others who do well as individuals. But, with the trillions of photos on the web, and with the sheer magnitude of opportunity, what's really the barrier to growth?

Here's an excerpt from his email:

do you think that there could be an issue with just straight up too much content for buyers (all buyers)? Say for example - there was a website that everyone knew was the place to buy and sell images for any sort of use (commercial, etc.)? Couldn't there still be too many shots of a 'dog'? ... if all the buyers/sellers universally knew this was the place for images - wouldn't the back-end functionality of a site like this take an army of programmers to design? I have to wonder why something like this doesn't already exist or is in development by a major player like google?


Saying that there are too many photos out there is like saying that Google can't index the web because there are too many sites. And thinking that there could be a single go-to site for buyers and sellers is like saying that there could be a single go-to site to buy electronics. There isn't one because there's competition, etc. But, the reason why there's a viable, stable market for electronics (unlike photos) is because there are mechanisms in place that help establish price points, distributors, manufacturers, and so on. In short, it's a mature industry.

The same cannot be said of the photo industry for a variety of reasons.

To begin, it's not that there's "too much content"; it's that there's no reliable mechanism for sifting through it. Go to images.google.com and type "two men shaking hands" -- a common image search for business purposes. Though the matches are generally accurate, the results are also entirely arbitrary. We have no idea whether these are popular images, whether they were shot by famous people, whether they are "current", or even whether the source (website) is ranked highly.

The same is true for every website that displays photos--buyer website or not. "Arbitrary results" is why buyers have trouble finding what they want quickly and easily. Yet, despite the huge number of sites that talk about "swine flu", a quick text search on that topic usually gets you exactly what you want on the first page. You can even misspell it -- say, "swing flu" -- and still do well.

So the first problem that people have to solve is search. And this is irrespective of where people go. Now, people can argue about whether real buyers go to search engines or to stock agency sites, but the technology barrier exists nonetheless. Who's going to solve it? It cannot--by definition--be a stock agency. Why? Because they will not (and cannot) return results for photos they do not represent.

Whether now or in the future, search engines will be how most people (yes, buyers) find photos.

Now that's not to say that companies like Getty couldn't solve the problem. In fact, they should. To do so, however, they would have to change their business model from being an exclusive seller to one that acts as a proxy for others -- a grand middle-man, much like how Visa and Mastercard merely enable transactions with an infrastructure. They don't actually participate in the transaction itself.

The reason why Getty won't be coming to the table here is that they suffer from two basic errors in their understanding of the photo industry: first and foremost, they don't see the market outside of their existing world of traditional ad/media buyers. Yes, that's a big industry, and Getty services them adequately. But it's tiny in the broader world of image licensing transactions. And this leads to their second grave misunderstanding: the belief that the best way to service buyers is to have limited content that's hand-edited by seasoned photo editors.

This is not in keeping with the internet today. As we have all learned in the past decade, Web 2.0 means that the "crowd" is the editor, and order among the crowd is achieved by applying intelligent ranking algorithms, tracking their behaviors, and mining user preferences to glean predictability. Getty doesn't do this. No one does this. (Well, Google does with their regular text search for non-image content.)

Getty's model of knowing and understanding what buyers want is fine if you have one-to-one relationships with them, and that's what Getty's good at. I'm not suggesting that Getty doesn't keep what they have in-house. It's what they don't have -- what needs to be built -- that they lack, and what the industry needs.

For the industry to become "mature", there must exist two main functions: (1) a search and ranking system that returns reliably accurate results beyond any measure we see today, and (2) a predictable and viable pricing model that represents a true market-maker commodity market.

Granted, these are not simple problems to solve, but there is precedent for similar algorithms. For instance, Google took quite a few years before they came up with just the right mix of variables and weightings to determine which web pages match a given user's search criteria. A similar list of factors can be derived to determine quality image searches in a variety of contexts.

In the photo world, everyone looks to metadata (such as keywords) as the primary factor, but that's quickly morphing into something else. It's a longer discussion to have, but it sums up this way: Google stopped looking at web pages' "keywords" field in metadata because it became unreliable--the system can be gamed, and bad players were ruining the reliability of Google results by lying about the content of their web pages. Google solved this problem by no longer looking at web pages' keywords settings, and instead doing their own semantic analysis of web pages to determine what keywords should really be associated with them.

It's not that "keywords" themselves will go away in images, but the future of image search will go well beyond user-editable metadata. Factors like "age" (current-ness), supplier, number of views, and frequency of (published) use will come into play, along with automated programmatic analysis of images to assess various conceptual attributes. Hard? You betcha. Yet, just as it took Google years to come up with their "hundreds of variables" that determine a web site's ranking for any given search term, so too will effort have to go into determining the relevancy of any given photo search.
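To make the idea of combining such factors a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Photo` fields, the weights, and the normalization constants are hypothetical and not taken from any real search engine. It only shows the general shape of a ranking function in which keyword overlap is one signal among several.

```python
import math
from dataclasses import dataclass


@dataclass
class Photo:
    title: str
    keywords: set        # user-supplied metadata (can be gamed)
    age_days: float      # "current-ness"
    views: int           # number of views
    published_uses: int  # frequency of published use


# Hypothetical weights; a real system would tune hundreds of variables.
WEIGHTS = {"base": 0.4, "freshness": 0.2, "views": 0.2, "uses": 0.2}


def score(photo, query_terms):
    """Return a relevance score between 0 and 1 for one photo."""
    terms = set(query_terms)
    if not terms:
        return 0.0
    # Fraction of query terms found in the photo's keywords.
    match = len(terms & photo.keywords) / len(terms)
    # Newer photos score higher; the one-year decay constant is an assumption.
    freshness = math.exp(-photo.age_days / 365.0)
    # Log-scaled popularity signals, capped at 1.0.
    views = min(math.log1p(photo.views) / math.log1p(1_000_000), 1.0)
    uses = min(math.log1p(photo.published_uses) / math.log1p(1_000), 1.0)
    blend = (WEIGHTS["base"]
             + WEIGHTS["freshness"] * freshness
             + WEIGHTS["views"] * views
             + WEIGHTS["uses"] * uses)
    # Keyword overlap gates the score; the other signals reorder the matches.
    return match * blend


def rank(photos, query_terms):
    """Sort photos from most to least relevant for the query."""
    return sorted(photos, key=lambda p: score(p, query_terms), reverse=True)
```

In this sketch a photo with no keyword overlap scores zero no matter how popular it is, while two photos with identical keywords are separated by freshness and usage. That is a design choice for the example, not a claim about how any existing engine works.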

There are various image recognition algorithms that can begin to move in this direction, but once you get into the science of it, the field is much broader than people think. There are proximity algorithms to determine if two images are the same, there are content algorithms to determine image characteristics (color, textures, emptiness, focus, etc.), and pattern recognition (faces, emotions, objects, and other patterns). Underneath every image recognition algorithm must be a hierarchical database of seed information to spin the world into motion so that trillions of images can be automatically processed from web crawlers continuously.
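As one small illustration of the "proximity" category, the sketch below uses a simple average hash, a well-known basic technique for spotting near-duplicate images. It assumes each image has already been reduced to a tiny grayscale grid (a list of rows of 0-255 values); the function names and the 10% threshold are my own choices for the example, and real systems use far more robust methods.

```python
def average_hash(pixels):
    """Turn a small grayscale grid into a list of bits.

    Each bit is 1 if the pixel is brighter than the grid's mean, else 0.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(bits_a, bits_b):
    """Count the positions where two equal-length bit lists differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(bits_a, bits_b))


def probably_same(pixels_a, pixels_b, max_fraction=0.1):
    """Guess whether two grids show the same image.

    max_fraction is a hypothetical tolerance: the share of bits that
    may differ before the images are treated as different.
    """
    hash_a = average_hash(pixels_a)
    hash_b = average_hash(pixels_b)
    return hamming_distance(hash_a, hash_b) <= max_fraction * len(hash_a)
```

Because each bit only records whether a pixel is above the image's own mean, a uniformly brightened copy produces the same hash as the original, which is the kind of tolerance a duplicate detector needs.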

And this science isn't just to determine search relevancy: it would also be used by an auction-based system to set pricing for a trading system, exactly like how Google prices keywords for its advertising system. This is not simple college algebra, but it's also not science fiction. Similar models have been built to create efficiencies for more complex markets than photo licensing. (Personally, I envision a structure similar to the formulas used for pricing stock options. Here, rather than having a bid/ask market of quotes, prices are derived from external data based on aggregating a weighting of historical pricing patterns for images with similar characteristics.)
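A very rough sketch of "aggregating a weighting of historical pricing patterns for images with similar characteristics" might look like the following. To be clear, this is not an options-pricing formula; it swaps in a plain similarity-weighted average, and the tag sets, prices, and function names are all made up for illustration.

```python
def similarity(tags_a, tags_b):
    """Jaccard similarity between two sets of image characteristics."""
    if not tags_a and not tags_b:
        return 0.0
    return len(tags_a & tags_b) / len(tags_a | tags_b)


def estimate_price(target_tags, history):
    """Estimate a license price from past sales of similar images.

    history is a list of (tags, price) pairs from earlier transactions.
    Each past price is weighted by how similar that image was to the
    target. Returns None when nothing in the history is comparable.
    """
    weighted = [(similarity(target_tags, tags), price) for tags, price in history]
    total_weight = sum(weight for weight, _ in weighted)
    if total_weight == 0:
        return None
    return sum(weight * price for weight, price in weighted) / total_weight
```

For example, with hypothetical past sales of `({"handshake", "business", "studio"}, 100.0)`, `({"handshake", "business"}, 200.0)` and `({"dog"}, 5.0)`, a new image tagged `{"handshake", "business"}` is priced mostly from the exact match, partly from the studio shot, and not at all from the dog photo.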

So, who's going to do it? Therein lies the $64,000 question.

Here, there are two problems. First, as just described, there's a lot of technology involved, ranging from all the various image-search algorithms to the pricing analytics. This by itself is already way beyond what any one company has done. So, whoever's going to attempt this must likely be large enough to go on a bit of a buying spree.

Secondly, there needs to be a general realization that there's money to be made. This is difficult when the cultural behaviors around photos are to share, steal, or do whatever you want. Most investors don't really see the "vision" of a viable worldwide market as these sorts of behaviors become further embedded in our online social fabric. This, despite the fact that research shows that photo licensing is still a $25B industry anyway. Comprised mostly of peer-to-peer transactions, it is now precisely how the online advertising industry was before Google.

More reading on the true size of the photo industry can be found here.

Google could solve both problems outlined above, but that would introduce another problem: They are in the advertising business. If they were to enter a business that monetized content on the web, it would bring into question the objectivity of their search/ranking algorithms, which is the only reason people trust their advertising model. That is, buyers and advertisers believe the pricing model because Google currently has no financial interest in monetizing the content on any given site.

What the world needs is a search engine from a company whose business model is not to sell advertising. Yahoo is becoming a much more likely candidate for something like this (as I'd noted in this blog post), but the catch-22 here is that if they were that visionary, they'd have started this kind of development with their Flickr property years ago. Yes, Flickr could be a good launch pad for such an endeavor, but Flickr's too busy with other things to bother with such fantasies.

Then there's Microsoft: they have the money and the infrastructure to actually accomplish this feat, especially given their efforts to build a good search engine. Their physical and political proximity to Corbis (started by Bill Gates) could also be a substantial contributor to the transaction and pricing models that will be needed.

But I don't have faith that we'll be seeing headlines in these areas anytime soon. That may change with the evolution of Web 3.0... For that, see this post:

http://www.danheller.com/blog/posts/economics-of-migrating-from-web-20-to-30.html
