All posts by datahive

IBM’s TrueNorth launches the concept of bio-inspired computing

Note: Although IBM’s TrueNorth announcement is not specific to social media, it represents a fundamentally different way to deal with networked data, using computing assets that simulate the human mind. It will provide the computing ability to analyze social media sentiment and intent thousands of times more efficiently than current computing hardware can. As such, TrueNorth is an interesting technology that should be on your radar over the next 5-10 years as it is commercially launched.

On August 7, 2014, IBM announced the development of the TrueNorth microprocessor through a paper published in the journal Science. The chip represents the first true proof of concept for bringing a neural computing approach to remote and low-power environments, and it does so while drawing only 70 mW, multiple orders of magnitude less than a comparable processor built under traditional computing constraints.

This neural approach is important because brains are fundamentally different from traditional computing environments. We react to our environment based on sensory inputs that travel through neurons and build patterns used to understand that environment. In contrast, a traditional microprocessor uses linear, sequential processing to solve problems through brute force. The neural approach also brings memory and processing together, so each experiential event directly shapes how data is processed.

TrueNorth contains 1 million programmable neurons, 256 million programmable synapses, and 5.4 billion transistors, the latter of which compares to the transistor counts of the most advanced microprocessors that are commercially available. However, a traditional microprocessor with this transistor count, working under the von Neumann architecture that has defined computing for the last 70 years, typically requires over 100 watts to run. In comparison, the ability to run a microprocessor of this complexity at 1/1000th the power provides a unique opportunity to start bringing this level of cognitive intelligence to sensors, remote environments, and low-power environments that previously lacked the ability to cognitively analyze data.

The potential for TrueNorth is in providing high-performance sensory computing to environments that cannot support traditional computing or server infrastructure. Commercial household batteries can easily supply 70 mW for hours at a time. By combining this low power requirement with on-demand sensory inputs, a TrueNorth-based sensor could potentially make intelligent, pattern-based observations of its environment for months or even years in remote and challenging locations.

TrueNorth was initially tested as a cluster of 16 chips combined into 16 million neurons, which roughly matches the scale of a frog’s brain and spinal cord. This provides an immediate comparison for TrueNorth’s potential in robotics: independently supporting automation at the complexity of a frog’s reflexes, range of motion, and bodily functions.

In comparison, with traditional computing approaches, scientists have struggled to simulate even a few hundred neurons, even with the computing power of over a billion transistors. IBM’s approach represents an improvement of multiple orders of magnitude in developing an artificial mind.

Bringing TrueNorth to the real world

The potential for this chip will only continue to scale as TrueNorth eventually goes into production and can be combined into multi-core clusters that can come closer to imitating the human brain. However, even without this clustering, the near-immediate opportunity is in providing a new level of “intelligence” to any device or sensor using over 1W of power, or to provide sensory analysis on demand in low power environments. Although IBM’s lead researcher on this project, IBM Fellow Dr. Dharmendra Modha, estimates that it will take 6-7 years for this chip to be fully ready for mass production, the variety of use cases for this chip may force IBM’s hand in accelerating this timeframe.

Some of the more obvious use cases for this chip include deep-sea and space exploration, remote environmental analysis, plant and refinery sensors, and augmented analysis for any equipment that already draws energy. In comparison to the multi-watt power requirements that most modern appliances have, an additional 70 mW would be a trivial energy draw for the level of insight that could be gained.

But this chip could also bring about the smart lamp, the smart electrical socket, or a truly smart watch that can put the quantified self in context of both the experiential and digital worlds. The Internet of Things could become a Neural Net of Things. Smart solar panels (which typically provide at least 75 to 100 watts of power) could provide insight into the true efficiency of solar conversion, based on weather and environmental factors, far more easily than current solar trackers. The pattern recognition associated with a neurosynaptic computing environment will also prove useful in synthetic visual, audio, and tactile inputs that could be used to enhance our own senses through smart glasses, hearing aids, or gloves. (Think of how TrueNorth-enabled gloves, robotics, and Oculus Rift could work together to create touch-sensitive virtual environments.)

Recreating the human brain

TrueNorth is a second-generation chip that represents over a decade of research. It has been funded by DARPA since 2008 and was developed by IBM in collaboration with Cornell Tech and iniLabs. The goal of this research is to build a computing environment that can simulate the complexity and efficiency of the brain, which runs roughly 100 billion neurons on about 20 watts. Sustained over a day, 20 watts works out to a bit over 400 kilocalories, or about two ounces of gasoline: roughly the draw of a laptop in light use left running around the clock.
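As a back-of-envelope check (only the 20-watt figure comes from the text above; the unit conversions are standard constants), the arithmetic works out like this:

```python
# Back-of-envelope check: what does a 20 W brain consume in a day?
brain_power_w = 20.0                      # watts, i.e. joules per second
seconds_per_day = 24 * 60 * 60

energy_joules = brain_power_w * seconds_per_day       # ~1.73 MJ per day
energy_kcal = energy_joules / 4184                    # 1 kcal = 4184 J -> ~413 kcal

gasoline_mj_per_liter = 34.2                          # typical energy density of gasoline
gasoline_ml = energy_joules / (gasoline_mj_per_liter * 1e6) * 1000
gasoline_oz = gasoline_ml / 29.57                     # fluid ounces

print(f"{energy_kcal:.0f} kcal/day, about {gasoline_oz:.1f} oz of gasoline")
# -> roughly 413 kcal/day and a couple of ounces of gasoline, matching the figures above
```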

So, how close does TrueNorth actually get to matching the efficiency of the human brain? Here’s a quick comparison:

            | Neurons     | Wattage       | Neurons per Watt
TrueNorth   | 1 million   | 70 milliwatts | 14 million : 1
Human Brain | 100 billion | 20 watts      | 5 billion : 1

A rough estimate says that TrueNorth is currently about 1/350th as efficient as the brain. That may make it seem as if artificial intelligence is still far from matching biological efficiencies. However, noted futurist Ray Kurzweil has often observed that technological progress happens not linearly but exponentially. The phenomenon is seen most obviously with Moore’s Law, which holds that the transistor count on a chip doubles roughly every two years; combined with other gains in computing capacity, effective computational power ends up roughly doubling every year.
This is purely hypothetical, but consider if IBM is able to make the same or similar type of progress with TrueNorth. In a Kurzweilian vision of the world, getting 1% towards a technological goal means that it is likely to happen within seven years, since progress doubles every year. With this assumption, even being only 1/350th of the way towards a goal today would mean that it would happen in nine years.
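A minimal sketch of that arithmetic, using only the figures quoted above and the purely hypothetical assumption that efficiency doubles every year:

```python
import math

# Efficiency figures from the comparison table above
truenorth_neurons_per_watt = 1_000_000 / 0.070      # ~14 million neurons per watt
brain_neurons_per_watt = 100_000_000_000 / 20       # 5 billion neurons per watt

gap = brain_neurons_per_watt / truenorth_neurons_per_watt    # ~350x

# Kurzweilian assumption: the gap halves every year (progress doubles annually)
years_to_parity = math.ceil(math.log2(gap))
print(f"gap: ~{gap:.0f}x; years of annual doubling to close it: {years_to_parity}")
# -> gap ~350x, about 9 years of annual doubling to reach brain-level efficiency
```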

Even if this timeframe is very optimistic, the ramifications of the creation of this chip are enormous. We now have a model for non-von Neumann computing that starts to imitate a more efficient and sensory model of processing. We can expect a chip that will match the processing efficiency of the human brain, if not the actual capacity, at some point in our lifetime. And this chip model changes both our assumptions of what can be computed in challenging environments and our assumptions on the ubiquity of cognitive computing.

The emergence of bio-inspired computing

This approach will lead to a fundamental change in how CIOs need to think about computing. Currently, we think of sensors, applications, and CPUs, but as computing more closely aligns to biological structures, mechanisms, and concepts, we will need new measures. One potential example is clocking computing based on a neuronal or fractional brain measurement rather than megahertz or gigahertz. Imagine being able to literally put 0.1% of your brain power (a milliBrain) into a specific observation and judging the complexity of a problem based on the amount of artificial brain power assigned to it. This type of computation does not fit into our current expectations for programming efforts, since the observations and pattern recognition capabilities of a fractional brain will not necessarily be standardized over time. As we start to equate computing challenges with the percentage of a human brain needed to process and analyze the challenge, fundamental assumptions of application development and application performance will change based on the reality of bio-inspired computing outputs.

The biggest gap in realizing this new paradigm is the current lack of expertise in creating neural applications. This trend is actually part of a larger technological trend in which biology, data, and programming are increasingly coming together. In 2012, Harvard scientists placed 700 terabytes of data on a single gram of DNA, effectively translating the bases of DNA into binary data stores. Biofeedback is poised to be the next big gaming feature. With the second-generation TrueNorth processor, IBM now has a chip that can imitate neuronal activity. These bio-inspired technologies provide the potential to sense, store, and analyze data several orders of magnitude more efficiently than traditional computing approaches allow.

However, we as a culture lack the developers to truly take advantage of these new biological programming assets. To move forward, IBM will need to make Compass (its TrueNorth simulator) and Corelet (its programming environment) more readily available. Corelet has been available for a year, but up until now the viability of this technology was not readily apparent to the commercial market. As a starting point, IBM seems to be focused on opening this environment up for video, signal, and object recognition to recreate the cognitive activities associated with sight. With time, IBM will also be able to support audio and tactile inputs to create neural computing environments that do not simply collect data, but can also react both cognitively and reflexively in real-time to changes in their environment.

Guidance for the bio-inspired paradigm shift

In short, the future is catching up to us faster than ever before. As amazing as all of this may sound, IBM has both the hardware and the beginnings of a new form of computing that will fundamentally change pattern and environmental recognition. Given that both the hardware and programming environments already exist, highly competitive technology industries (including financial services, petrochemical, and retail) and early-adopter companies across all industries should at least be taking a look at this new developer environment and preparing for the near future. In today’s world of technology, proofs of concept like this turn into fully commercialized technology in a couple of years and can become a standard in 5-10 years. Because of the incredible rate of technological change, it is vital that forward-looking companies prepare for fundamental shifts in computing. Just as social networking and cloud computing fundamentally changed the basis of application development, this neural approach will be a sea change in application development and functionality.

5 B2B Marketing Tips for Using LinkedIn Shares

LinkedIn is the best site for business-to-business sharing, yet it remains both underused and poorly used as a B2B channel. Some treat LinkedIn as just another Facebook, while others treat it more like Twitter. Both of these approaches are wrong and lead to social media activity that doesn’t deliver any value.

One of the key challenges with LinkedIn is that it is a very personal social platform. We are used to seeing corporate accounts on Facebook at this point, and many organizations on Twitter talk about their value proposition and share information. But LinkedIn was not built simply to share information; it was fundamentally built to help professionals network and find jobs. Because of this, the individual is more important than the company on LinkedIn, and any marketing or branding efforts there are best done through the employees. This can be a tricky process, since it means the company needs to use each individual’s brand, so LinkedIn advertising won’t necessarily work for everyone. But where employees and companies are mutually comfortable using each other’s brands, LinkedIn can be a valuable social channel.

My LinkedIn analytics show that people actually view the updates I make. And it’s not just that I know a lot of people: although I have over 1,300 contacts on LinkedIn, I highly doubt that every one of them is checking LinkedIn daily and hanging on my every word. And it’s not that I’m more clever than everyone else; in fact, I’m typically sharing other people’s content. Rather, just as with Twitter and Facebook, there is an art to viral LinkedIn content. To give you an idea of how LinkedIn content works, here are a few suggestions supported by some real-world (i.e. my own) examples.

1) Make it work-related.

In general, we all share a number of basic challenges in the workplace, such as sales, marketing, hiring, meetings, paperwork, and production. Here’s a piece of content I shared that got over 1,000 likes:

[Image: EffectiveMeetings]

Pretty simple, right? It’s pithy, directional, and has a bunch of great tips. The one thing missing, which I wish were there, is some sort of branding. Is this from a company? An individual? A vendor who makes a solution? A consultant? I would want to keep this person on my short list, but I literally have no idea whom to thank for this.

2) LinkedIn is not Facebook, although it can still be fun.

On Facebook, viral content tends to be very fun and flippant in nature. Videos and pictures of cats and other adorable animals, unfortunate accidents, silly children, charities, and math problems tend to do very well.

But LinkedIn is not Facebook. Here, your charity appeals, non sequitur pictures, and “how many squares do you see in this picture” posts will gain limited traction. You’re talking to a working audience that wants some alignment between your comments and their workplace. Even if you’re going to be flippant, it still has to be work-related. I’m not saying you can’t have fun with your LinkedIn updates; just be work-relevant. Here’s one of the lighter-hearted updates that I shared:

[Image: DoItCheaper]

Again, notice what’s missing? No branding! It doesn’t have to be gigantic and obnoxious, but a small brand or URL in the corner wouldn’t have hurt this message. But it’s pretty obvious that people like the message. After all, many of us have been burnt either by buying something substandard or by losing a deal to a substandard provider. But don’t trust me; check out the numbers.

[Image: LinkedIn Price]

By combining a key business lesson with a great picture, we got a great result. Be funny, but remain relevant.

3) LinkedIn is not Twitter

On Twitter, a good quote or helpful link on its own will often go far. However, on LinkedIn, it’s not good enough to just be helpful. Take the following update that I recently shared:

[Image: SocialSellingVP]

It was very informative and it was shared from one of the top gurus in Social Selling, Jill Rowley. So, given that it was helpful, came from a legitimate expert and that I have a relatively large social following, how many views do you think I got?

[Image: LinkedIn Social Selling]

That’s right. Just 38. But it just goes to show that a link or statement isn’t enough. And an appeal to authority isn’t enough. The graphical element is important as well.

To drive home the point, I also recently shared an article from the Harvard Business Review on the emotional boundaries needed at work. Harvard is obviously one of the most respected business brands in the world. And I thought the article was important and potentially helpful.

[Image: HBREmotionalBoundaries]

It’s a good article about how we need to both consider how we need to protect other people from us and protect ourselves from others. If we beat up our employees day after day, “You may feel like a victim but will act like a bully.” Really smart stuff.

But, I didn’t add a graphic. And, from a “popularity” perspective, here was the result.

[Image: LinkedIn Emotion]

Again, well under 100 views. It’s not because LinkedIn doesn’t like smart content; LinkedIn just requires a smart slide or diagram to go along with the concept.

4) LinkedIn content can be pretty sophisticated.

Unlike Twitter, you’re not limited to a single line of text. And unlike Facebook, you don’t have to dumb down your content. LinkedIn is actually a great place to share platforms, structures, and conceptual diagrams. For instance, I recently shared a great diagram equating Maslow’s hierarchy of needs to tiers of employee engagement.
[Image: Maslow’s Hierarchy of Needs Applied to Employee Engagement]

It’s a relatively text-heavy diagram with multiple layers of maturity and levels of activity ranging from operational to tactical to strategic to aspirational. This type of content might work on Pinterest, but would likely be too complicated for Facebook, Twitter, or Instagram. But LinkedIn is different; your working friends are often looking for insight. (And, again, notice the lack of branding. People, put your brand on your content!)

But even so, don’t just believe me. Believe the numbers.

[Image: LinkedIn Maslow Engagement]

It at least got the attention of 800+ people and 47 likes in a social setting that has already self-selected to be a professional, tech-savvy, and interested audience. It’s no secret that many managers are looking for a better way to keep their employees engaged.

So, those are the keys to viral LinkedIn content, based not on opinions or guru status but on a basic demonstration of what actually works. Remember both the lessons that were shown and the big one that was missing:

1) Make it work-related
2) Make it fun
3) Add a relevant slide or diagram
4) Don’t be afraid to be sophisticated or complex

And the missing number 5
5) Brand your work and take credit for it! You have no idea what kind of reach you may get!

With these five steps, you can take advantage of LinkedIn and get your message out to the right people as well.

If you’d like to learn more about LinkedIn or just get a free social consultation with us, please feel free to email us at [email protected] or call us at 415 754 9686. And if you want to work with us, please check out our Social Jumpstart to get a basic idea of how we work with clients.

You Can’t Spell Informatica without IT

Informatica has long been known as a data management and data integration company. Throughout its history, Informatica has been synonymous with being as close to enterprise data as a company can get, and it has been a market leader in ETL, data quality, and master data management. But by the mid-2000s Informatica seemed to have hit a rut: it was the master of its chosen markets but didn’t know where to go next. Meanwhile, new competitors started to emerge, powered by the cloud and the resurgence of enterprise mobility started by the iPhone and continued by the rise of Android and Samsung. Informatica had two choices: innovate or slowly fade away.

At Informatica World 2014, Informatica showed that it was truly devoted to data innovation that took the key data-driven aspects of social, mobile, and cloud into account with the concept of the Intelligent Data Platform. We discussed some of the ramifications of the IDP in our prior post, Informatica Wants to Build the Most Relevant Version of the Truth.

Within this platform, Informatica launched two innovations that venture-backed Silicon Valley startups would be proud to bring to market, each with the potential to change IT departments: Project Springbok and Secure@Source.

Project Springbok is a self-service data harmonization product that greatly simplifies data quality efforts by providing an Excel-like self-service interface to access and enrich your datasets. Informatica’s VP of Platform Product Marketing Piet Loubser described this as “metadata meets machine learning,” which immediately grabbed our metadata-loving hearts at DataHive. As end users access data, Springbok will automatically determine the relative size of the dataset and the quality of the data compared to typical enterprise data. Based on these parameters, Springbok will provide suggestions to automate the quality of data. For instance, if a column in the dataset is supposed to be a “Yes/No” field, but has “Yes”, “Ye”, and “Y”, this will hamper analysis and basic fact gathering. Springbok automatically profiles the data and allows business users to see all of these options and then recommend or infer ways to fix these fields. Springbok will also take multiple datasets and suggest columns that should be joined to provide the most relevant version of the truth to the end user.
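To make the idea concrete, here is an illustrative sketch (in pandas, not Informatica’s actual code) of the kind of Yes/No harmonization step described above; the column name and value map are hypothetical:

```python
import pandas as pd

# Illustrative sketch only: the kind of "Yes/No" harmonization Springbok is
# described as suggesting automatically. Column and values are made up.
df = pd.DataFrame({"opted_in": ["Yes", "Ye", "Y", "No", "n", "YES"]})

# Profile the column: how many distinct spellings of what should be a binary field?
print(df["opted_in"].value_counts())

# A recorded, reusable cleanup step: map the observed variants to canonical values
yes_no_map = {"yes": "Yes", "ye": "Yes", "y": "Yes", "no": "No", "n": "No"}
df["opted_in"] = df["opted_in"].str.strip().str.lower().map(yes_no_map)

# Because the mapping is an explicit artifact, an admin could replay the same
# step against the source system, which is the reuse pattern described above.
print(df["opted_in"].value_counts())
```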

In and of itself, this self-service and machine-learning aided data wrangling is helpful but not unique in the market. (Paxata and Trifacta come to mind as early leaders in this area.) However, Project Springbok has some additional tricks up its sleeve. For starters, each data quality step that is conducted by an end user is automatically recorded and can be reused either by another user or by an admin who wants to clean up the data at its original source. By giving end users a simple point-and-click method for cleaning data while recording workflows that can be integrated into enterprise data management, Springbok has the potential to solve the core problems in data quality: the sheer man-hours needed to fix dirty data and the ability for relevant end users to make immediate fixes to data that can be reused at scale by data managers.

This capability reminded me of the work that the sabermetric world did in the 1980s and 1990s to share and analyze baseball data. By expanding the access of data from the chosen few in professional baseball to a wide range of professionals, academics, and students online, we started to create communities dedicated to both cleaning and understanding baseball data. This was an important precursor to the Moneyball era and without access to clean data, Moneyball would not have happened. The spread of self-service data management workflows that can be quickly created and brought back to a community, department, or company is similarly a precursor to a new era of data-driven business where a majority of employees can actually access and manage some aspect of their data in something other than the ubiquitous Excel without creating new data inconsistencies or errors.

Although the hype of Moneyball and the rise of the analytic enterprise have been heralded over the past several years, the business relevance may have been premature: enterprise data has never gotten to the point where end users clean up the vast majority of business data in a coordinated and networked manner. This is a key opportunity for Informatica, as Springbok validates both data harmonization and user-based data preparation.

Also, Springbok takes the social aspect of data seriously. Although the idea of “social” in the enterprise is often relegated to the idea of social networks such as Facebook, the fundamental importance of “social” is in creating trusted collaboration and interest-based groups. Springbok plans to track the users who access data as well, so that employees can start seeing who else may have previously transformed or modified it. This will allow end users to independently create social networks based on their shared interest in specific data and potentially unlock new patterns of data usage and data interest within the organization. By combining fundamental data cleansing and linkages, social linkages, and the repeatability of enterprise data workflows into a single product that works in the context of Informatica’s vision of an Intelligent Data Platform, Springbok has an exciting opportunity to improve the data management market. This improvement will not come from the speeds and feeds and raw processing power that typically define progress in the data management world, but from fundamentally improving the usability and visibility associated with data.

Project Springbok is currently available in beta with a target of general availability in the fourth quarter of 2014. Putting on the fantasy baseball data-crunching hat that I’ve worn for almost 20 years, I would think that Springbok will be most useful for cleaning up frequently used operational data that normal end users see. This lines up with sales and service data (including internal IT help desk, customer service, and field service) created and used by the people who keep the lights on in an organization. Springbok will make it relatively easy for Excel-savvy salespeople to clean up and rationalize existing data while creating tools and workflows that will help the rest of the company. Every company has “last mile” data quality problems that can’t be solved with automation or by the few data management gurus on staff. Springbok is a promising tool both to engage end users with the data and to introduce employees to each other through the social use of data.

A second announcement that truly caught my eye as being far outside Informatica’s typical interests was Secure@Source, a data security product. Informatica CEO Sohaib Abbasi made clear in his keynote that “In the new world of pervasive computing – with Cloud services, mobile devices and sensors everywhere – there is no perimeter”.

This is fundamentally true. When data no longer lives on-premises and can be simultaneously accessed by thousands of mobile devices over cellular, Wi-Fi, and landline networks, what is the point of trying to secure a perimeter? In today’s mobile and cloud-based computing world, traditional security tools such as passwords and encryption are increasingly insufficient on their own in the face of the raw computing power that can be thrown at them. Mobile device security is advancing rapidly, but enterprise adoption of these tools is still relatively low despite the success of SAP, VMware AirWatch, Good Technology, MobileIron, BlackBerry, and others. The new paradigm of security must be shaped around the use and access of the data itself. This is the eureka that Informatica has discovered with Secure@Source.

Secure@Source allows companies to classify access and prioritization for specific data, using the data lineage and data management tools that Informatica PowerCenter has long been known for. Secure@Source is designed to identify specific data that is at risk based on source, location, proliferation, or specific compliance concerns. It also defines data usage policies at the source of the data, which allows security policies to be abstracted from the delivery network, application, or endpoint device. Once sensitive data has been identified and associated with relevant security policies, the data can be tagged and tracked as it is accessed by other applications, devices, or reports. By getting to the crux of the problem, securing the data itself, Informatica is going to shake up traditional views of data security. Rather than focus on how to mask, encrypt, securely deliver, securely access, virtualize, and authenticate data, companies will now be able to focus on placing policies directly on the data itself.
This concept seems simple: secure the data at its source. But the giant Goliaths that currently rule IT security do not have a competing product to match it.
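As a purely illustrative sketch of what “policy on the data itself” could look like (this is my own mock-up, not the Secure@Source API), consider classification metadata that travels with a record and is enforced wherever the record is accessed:

```python
from dataclasses import dataclass

# Hypothetical illustration: classification and policy travel with the record,
# so any application, device, or report can be checked against the same rules.
@dataclass
class DataPolicy:
    classification: str      # e.g. "PII", "PCI", "public"
    allowed_regions: tuple   # where the data may be accessed from
    masked_fields: tuple     # fields to mask for non-privileged users

@dataclass
class TaggedRecord:
    source: str
    payload: dict
    policy: DataPolicy

def can_access(record: TaggedRecord, user_region: str, privileged: bool) -> dict:
    """Enforce the record's own policy, regardless of delivery channel."""
    if user_region not in record.policy.allowed_regions:
        raise PermissionError("access denied by data-level policy")
    if privileged:
        return record.payload
    return {k: ("***" if k in record.policy.masked_fields else v)
            for k, v in record.payload.items()}

record = TaggedRecord(
    source="crm.contacts",                                  # hypothetical source name
    payload={"name": "Jane Doe", "ssn": "000-00-0000"},     # placeholder values
    policy=DataPolicy("PII", ("US",), ("ssn",)),
)
print(can_access(record, user_region="US", privileged=False))  # ssn comes back masked
```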

I feel like I’m seeing the emergence of a new security category: direct data security. And this category feels very similar to the mobile device management industry that emerged in the mid-2000s as the Blackberry started to fall out of vogue. Similar to mobile device management, which was largely ignored by the endpoint security vendors at the time until it was too late to gain significant market share, direct data security will be a field that largely escapes the security megavendors while nimble and data-fluent vendors grab the majority of the market. It will be interesting to see how long it takes for Informatica to start selling into the security market and for this product to start moving the needle on Informatica’s revenues.

And there is additional potential for this concept to further change the IT world as well. If you can track and manage data directly from the source to the end user, you can also measure utilization and traffic. This could lead to a more accurate and granular way of planning Wide Area Network deployments based on a better understanding of demand. Rather than simply estimating traffic based on certain types of content or application types, enterprises could directly track the data itself. Informatica has not announced any plans for network capacity monitoring or assessment, but this could be an interesting next step to take advantage of the data tracking and monitoring capabilities that Informatica has long had across the entirety of enterprise data.

Secure@Source is going to be available in beta in the second half of 2014, with the goal of general availability in 2015. As a former enterprise IT guy, I believe IT departments should jump on this beta, as it will add immediate value to security war rooms and NOCs (Network Operations Centers).

As IT has shifted from a hardware-based department to a data-driven department, management and security tools have not kept pace. With Informatica’s announcements last week at Informatica World, IT now has a chance to augment top-down data management efforts and perimeter-based security efforts with efforts that focus on people and data, which are the true foundations of any company. It can be easy for a multi-billion-dollar company to rest on its laurels or to get distracted by side projects that do not solve core enterprise technology problems (as some in the data management and business intelligence vendor spaces have done). With these announcements, Informatica is stepping into new markets that both reflect its heritage and show a willingness to take risks and step outside its traditional comfort zone. These announcements fundamentally demonstrate the ongoing opportunities that exist in enterprise data, which is now the true foundation of IT in a social, mobile, and cloud-based world.

IndieWebify.Me and the Knowledge Gap

Last week, a friend asked me what I thought of IndieWebify.Me, a movement intended to allow people to publish on the web without relying on the tools and storage of the giant corporations that currently control the majority of the social web. I’m the kind of person who gladly supports her local independent bookstores and farmers’ markets and food purveyors, links to IndieBound.org instead of Amazon to buy books, and admires the ideals of Open Source Software. So, I’m biased towards an independent and open experience.

IndieWebCamp, the conference devoted to strengthening the Indie Web, describes the concept of the “Indie Web” thus: “We should all own the content we’re creating, rather than just posting to third-party content silos. Publish on your own domain, and syndicate out to silos. This is the basis of the ‘Indie Web’ movement.” You’d think I’d be all over a movement aimed at bringing more of that independent spirit back to the modern internet.

I’d love to be, but I can’t just yet. IndieWebify is an ideal with some pretty serious barriers to implementation, key among them the base level of knowledge necessary for the average citizen of the internet to “Indie Webify” themselves.

If you look at IndieWebify’s main page, there are three levels of “citizenship,” each with two steps to implementation. In theory, six steps don’t seem that challenging. Unfortunately, the reality is more like WordPress’ Famous Five Minute Install – it assumes familiarity with technical concepts that your mainstream Internet citizen lacks. I’m a reasonably tech-savvy person. I can write HTML and CSS and SQL and work with JavaScript and jQuery; I’ve maintained self-hosted websites for almost 15 years now. Steps 1 and 2 seem fairly straightforward – set up a domain name, then on the home page, add a few slightly enhanced links. Not too difficult. But Step 3 (the first step to publishing on the “Indie Web”) is more confusing: “Mark up your content with microformats2.”

Okay, clearly, I’ve got some reading to do, so I click through to learn about microformats2. The general idea isn’t too difficult for someone accustomed to writing HTML and CSS – microformats2 is a collection of standardized class names that should be applied to web content to help computers contextualize things like blog posts and comments. But this leads me to a lot of questions: Can I make my existing installation of WordPress automatically include the microformats2 markup when I write blog posts? (No.) Do I need to manually mark up my content every time I write a post? (Maybe, but that’s a long list of class names to memorize or be constantly referring to.) What is an h-card in this context? Why does it seem to represent multiple opposing standards? … and who do I know that knows how to use the existing “implementations” (which are actual code libraries to be imported and implemented, rather than more user-friendly plugins)?
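For the curious, here is roughly the shape of that markup. The class names (h-entry, p-name, u-url, dt-published, p-author, h-card, e-content) come from the microformats2 vocabulary, but the helper function, URL, and date below are my own hypothetical illustration, not an official IndieWeb tool:

```python
# Hypothetical helper (not an official microformats tool): wraps a post in
# microformats2 h-entry markup so parsers can identify its parts.
def as_h_entry(title: str, url: str, published: str, author: str, body_html: str) -> str:
    return f"""
<article class="h-entry">
  <h1 class="p-name"><a class="u-url" href="{url}">{title}</a></h1>
  <time class="dt-published" datetime="{published}">{published}</time>
  <a class="p-author h-card" href="/about">{author}</a>
  <div class="e-content">{body_html}</div>
</article>
""".strip()

print(as_h_entry(
    title="IndieWebify.Me and the Knowledge Gap",
    url="https://example.com/indiewebify-knowledge-gap",   # hypothetical URL
    published="2014-06-10",                                # hypothetical date
    author="DataHive",
    body_html="<p>Last week, a friend asked me...</p>",
))
```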

Talk about jargon-filled! The technobabble here assumes users possess a fairly high baseline of coding knowledge. Though I’m willing to click on the links to learn more, this process is nowhere near as quick and simple as joining an existing social site. And this is just step 3 of 6 – we haven’t even gotten to implementing the federated (whoops, more technobabble) cross-site conversations that are the core capability that would let you properly “own” all of your words and attribute them to you in the context of your personal domain. Compare this to the existing Corporate Web options, like Facebook and Twitter and Google, where the only thing you need to know how to do is type the natural-language words you want to share.

Even assuming you have the motivation to learn, this is not an easy proposition. Buzzfeed’s Charlie Warzel wrote of Twitter: “Ask a longtime user to tell you about their first experience with Twitter and they’ll probably lead with some variation of, “Somebody showed me how to use it…” The idea [is] that, unlike most social networks [today], you didn’t usually just discover and use Twitter – you are taught, or at least climb a fairly steep learning curve.” He then goes on to explain that this isn’t good enough anymore; that for Twitter to continue growing, they need to cater to the mainstream, and make it easier to understand. IndieWebify’s version of this is so far from that point of being accessible to the mainstream that even early adopters are barely on the horizon.

Noted tech evangelist Anil Dash has pointed out how this technical insularity burned the development of the Open Web in the past: “We took it as a self-evident and obvious goal that people would even want to participate in this medium, instead of doing the hard work necessary to make it a welcoming and rewarding place for the rest of the world. We favored obscure internecine battles about technical minutia over the hard, humbling work of engaging a billion people in connecting online, and setting the stage for the billions to come.” Right now, IndieWebify.Me feels like it’s a lot of technical minutia. Maybe that’s how it starts, but it needs to get beyond that for broader adoption.

So, if you’re one of the few who actually knows how to implement these new Open Web tools and want to see the Open Web succeed, what can you do to spread this? As I mentioned above, “somebody showed me how to use it” doesn’t scale, so new tools require accessible design and/or tutorials. The challenge is that IndieWebify.Me currently has a simplified set of instructions, but these still need to be translated further to the technical capabilities of the early adopters, not all of whom are programmers. In comparison, most new social apps and websites come with engaging tutorials that do not require learning a complex set of standards or platform protocols, or being tied to a dictionary of these terms. This is the opportunity for evangelists who are serious about the development of the Indie Web as a competitive and viable alternative: create tools that will let users add these capabilities to existing publishing platforms as easily as I installed Facebook and Twitter on my phone. Heck, WordPress itself is already Open Source. I’d love to be able to install a WordPress plugin that would IndieWebify this blog; there are some plugins out there for older microformats standards, but none fully supporting the microformats2 standard as far as I can tell. I don’t want to have to write my own CMS just to connect this blog to the Indie Web communications mechanisms.

Despite my idealism and my honest desire for an Open Web, I am concerned about IndieWebify’s ability to support this dream; it can’t be just a niche for techies. They need better outreach targeted to idealists like me whose desires outweigh their current coding capabilities, and they need to make the process itself much simpler. I hope the current model of IndieWebify is an intermediate step towards a simpler adoption pattern that can compete with Apple and Google from a usability perspective. In today’s computing world, usability has proven to be the ultimate judge of adoption, as social tools such as Tumblr and WhatsApp have shown. By bridging the knowledge gap, the IndieWebify movement can go a long way towards building the next generation of the Open Web.

How #Gartner Got Gamification Wrong

Apparently, Gartner has decided to “Redefine Gamification.”

I’m torn. On the one hand, I’m glad that Gartner is putting a stake in the ground in an area where I’ve done projects on and off for the past decade. But on the other hand, the reasoning behind their definition is not backed up by the market or by hands-on experience.

Their definition is as follows:

Gartner is redefining gamification as “the use of game mechanics and experience design to digitally engage and motivate people to achieve their goals.”

That is not a bad definition. It’s a bit incomplete, but I could use this in a consulting engagement as a “Gartner-approved” definition without losing credibility or face.

However, I disagree with all of the element definitions, because repeating those details in a client engagement would make me look like an amateur in my understanding of gamification. To explain why, let’s walk through this definition bit by bit.

“Game mechanics describes the use of elements such as points, badges and leaderboards that are common to many games.”

No. Game mechanics define game and play-based interactions, only some of which are award-based elements such as points, badges, and leaderboards; that is the basic definition of what game mechanics are. Chess is a game. Hopscotch is a game. Gamification does not have to consist of point-based interactions to be gamification. Game mechanics also include the methods, rules, and actions used to play the game, and effective gamification requires efficient rule creation and engaging behaviors.

As a simple example, you could create a data quality project where everybody cleans up a contact list. For every missing field that gets filled in, somebody gets a point, and the top 5 go on a leaderboard. That’s the addition of a game element.

But to make a better gamification experience, you could add a manual aspect, such as clicking a big button that says “Next Challenge!” to jump to the next field that needs to be filled in. To increase productivity, you could force players to move only horizontally or vertically, or change the text color when a player completes a certain number of fields within an hour. Time constraints, buttons, and movement controls are all game mechanics as well, and they should be included in a grown-up gamification effort.
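Here is a minimal sketch of those mechanics: a point per filled field, a top-5 leaderboard, and a “Next Challenge!” action. The data and scoring values are illustrative, not drawn from any particular gamification platform:

```python
from collections import defaultdict

# Illustrative sketch of the contact-list cleanup game described above.
contact_list = [
    {"name": "Acme Corp", "phone": "", "email": "acme@example.com"},   # hypothetical rows
    {"name": "Globex",    "phone": "555-0100", "email": ""},
]
scores = defaultdict(int)

def next_challenge():
    """The 'Next Challenge!' button: find the next field that needs filling."""
    for row, record in enumerate(contact_list):
        for field, value in record.items():
            if value == "":
                return row, field
    return None

def fill_field(player, row, field, value):
    contact_list[row][field] = value
    scores[player] += 1                  # the award element: one point per fix

def leaderboard(top_n=5):
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

row, field = next_challenge()
fill_field("alice", row, field, "555-0199")
print(leaderboard())   # [('alice', 1)]
```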

Fundamentally, game mechanics are dynamic and represent some sort of action. The motions and efforts used to play a game are a vital part of game mechanics.

I realize that some of these options still don’t exist in existing “gamification” platforms. I would argue that this is part of why gamification still hasn’t taken off: the platforms in place aren’t ready for the level of gamification maturity that is necessary to truly optimize business processes.

“Experience design describes the journey players take with elements such as game play, play space and story line.”

Play space and game play are both important, but are very different skills and experiences. “Dungeons and Dragons” and “Hide and Seek” are both rewarding game experiences, but an open-ended scenario development design is a fundamentally different structure and space from a closed “tag, you’re it” design.

In addition, narrative is different from game mechanics. You can watch a movie or read a book and get a story line. Yet, none of us call these activities “playing a game” because they lack the interactive nature of a game. Although narrative is an interesting potential aspect of gamification, it should really be an optional add-on rather than part of the core gamification experience design.

Going back to the data quality challenge, there are ways to add game play aspects outside of the core game mechanics. You could change the text color every time a player fills in 40 fields in an hour, moving them to the next level, or add bonuses for filling in multiple fields in one row or for taking on a new empty field within 10 seconds of the last one.
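A small, hypothetical extension of the earlier sketch shows how such experience-design rules could sit on top of the core mechanics; the thresholds and bonus values are invented for illustration:

```python
from time import time

# Illustrative experience-design layer: a level-up at 40 fills within an hour
# and a quick-follow-up combo bonus. All numbers are made up for the sketch.
class PlayerSession:
    def __init__(self):
        self.level = 1
        self.bonus_points = 0
        self.fill_times = []

    def record_fill(self, now=None):
        now = now if now is not None else time()
        # Combo bonus: a new empty field tackled within 10 seconds of the last
        if self.fill_times and now - self.fill_times[-1] <= 10:
            self.bonus_points += 2
        self.fill_times.append(now)
        # Level up (and, in a real UI, change the text color) at 40 fills per hour
        recent = [t for t in self.fill_times if now - t <= 3600]
        if len(recent) >= 40:
            self.level += 1
            self.fill_times = []   # reset the window for the next level

session = PlayerSession()
session.record_fill(now=0)
session.record_fill(now=5)       # within 10 seconds -> combo bonus
print(session.level, session.bonus_points)   # 1 2
```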

“Gamification is a method to digitally engage, rather than personally engage, meaning that players interact with computers, smartphones, wearable monitors or other digital devices, rather than engaging with a person.”

Also disagree on multiple levels. Players can compete and engage directly with other players in a gamified experience, but there is a game design or interface that facilitates interaction. But computer-facilitated personal interaction is still personal interaction. The separation between interacting through a computer or in person reminds DataHive of the false dichotomy that existed between “on-line” and “off-line” friends in the pre-Facebook era. Now that almost all of us have at least one friend that we’ve spoken to “on-line,” that false comparison is disappearing and we call them all “friends.”

It is possible to have a gamification experience that is solely digital with one person against one environment, but this is more akin to an arcade game like “Pac Man” or “Donkey Kong” where only one person can play at a time, nobody else can interact or interfere, and there is a single goal to achieve. This type of gamification is ill-suited to the gamification of business processes to enable team or corporate goals. In business settings, there is typically some level of team or community aspect to gamification. As a general example, the existence of a scoreboard is a level of social ranking and interaction between players.

The advancement of wearable monitors and mobile devices means that the interactions for gamification may not even be manually entered into a digital device. Consider two field service employees or delivery people racing against each other through the city to complete their orders. Their actual behavior is only minimally related to computer-based interactions, yet through their mechanically tracked actions and a set of game mechanics, this could quickly become a very gamified work place.

For a data quality challenge, one could add a chat window for employees to trash talk or help each other. Or one could enable messages or user tags within each corrected cell to show who had corrected or filled in each cell. There are a number of ways to provide “digital engagement” that personally engages players. In fact, the goal should be to personally engage players through some sort of appeal. By taking away the personal engagement, you remove much of the ongoing benefit of a gamification environment.

“The goal of gamification is to motivate people to change behaviors or develop skills, or to drive innovation.”

No. The ultimate goal may be behavioral or skill change, but the tactical goal that gamification actually accomplishes is to drive a specific action or set of actions. Let’s not forget that we’re talking about “game-based behavior.” The initial goal is to provide a behavioral framework and incentive structure to get specific work done. This doesn’t have to result in a Pavlovian reaction or behavioral change any more than Tetris forced players to obsessively rearrange their furniture into neat rows. Once you leave the game, you may not necessarily change behavior. It can be enough to simply participate in a gamified environment willingly and learn the rules of the game.

“Gamification focuses on enabling players to achieve their goals. When organizational goals are aligned with player goals, the organization achieves its goals as a consequence of players achieving their goals.”

Actually, corporate gamification is about teaching players to achieve corporate goals. Don’t get me wrong: I believe that personal and corporate goals should be aligned in the workplace. But gamification should focus on the behaviors that make employees more effective and gamification should be used to get work done. Otherwise, what is the value? It’s odd to have to tell this to Gartner, but we don’t go to work in mid-sized or large companies just to do what we want. We go to work to achieve goals as a team that we can’t accomplish as individuals. Gamification should be built around the corporate goals first, then be customizable based on individual KPIs and achievements.

This is an area that gamification platforms are starting to understand by developing increasingly modular and individualized incentive structures that are still based on a corporate strategy and corporate decisions. Platforms that are still stuck in a top-down, one-size-fits-all model are point tools that may be well designed for specific tasks such as sales, marketing, or ideation, but are not well-suited for enterprise usage.

So, why is Gartner so far off?

It’s not because of the quality of the analyst: I have actually enjoyed the vast majority of Brian Burke’s work and consider him to be one of the top infrastructure analysts in general. But Burke is ignoring all of the real hands-on work that has occurred in gamification for decades and it looks like he is coming in as a newcomer in this space. There is nothing wrong with that, but Burke needs to come in with the understanding that many best practices in gamification are older than the best practices for hybrid IT architectures or data warehousing. This isn’t an area where Gartner can simply come in and redefine all of the existing terms and best practices.

But this description shows a lack of experience in gamification, and I would hazard a guess that Gartner has few analysts who have built out gamification environments in the last five years or who have expertise working with the likes of Bunchball, Badgeville, and BigDoor, or with related software such as InnoCentive and Mindjet that create open scenarios for new ideas.

It’s a shame, since I expect that the net result will be that Gartner will continue to go along its own path and build a confusing definition of gamification that is poorly aligned with both the software and consulting markets. The last thing that an emerging market needs is confusion, especially since networked, social, and digitally enhanced work environments are going to continue to be hot topics over the next few years. This should be a good time for gamification efforts to succeed on top of existing social networking and enterprise mobility efforts. I only hope that we don’t lose the opportunity in the next few years because analysts try to create their own untested models for gamification.

Netflix Teaches Us the Grammar of Big Data

On January 2nd, 2014, Alexis Madrigal of The Atlantic wrote one of the best pieces on Big Data that we will see this year: How Netflix Reverse Engineered Hollywood. You should read the nearly 5,000-word piece in its entirety if you have any interest in Big Data, retail, or marketing, since it contains many insights on how to apply Big Data. We’ll just wait here while you read the story.

OK, now that we’ve all read this great article, what does it mean?

There are a few key lessons that DataHive took from Netflix’ 76,897 genres to keep in mind for 2014.

1. The key to Big Data is not processing; it’s metadata. It’s easy to think of metadata as an evil word in light of the recent NSA surveillance announcements. For those who are relatively new to the concepts of Big Data and metadata, it is easy to make the quick assumption that metadata is a bad thing and to wonder how personalized metadata collection could be helpful. This Netflix example shows the other side of metadata and how it has been fundamental to providing us with the personalized recommendations that we are starting to expect from our consumer services. For companies to make these suggestions, they first require the appropriate metadata.

2. Big Data requires a grammar and human interaction to be parsable.

For Netflix to make sense of the truly Big Data associated with thousands of movies, it was not enough to simply use the existing data of two-plus hours of video, cast and crew information, and other assets that typically run to three or four gigabytes per movie. Instead, Netflix needed to create a dictionary and grammar of relevant phrases, and then farm out the metadata tagging and contextualization to people who could analyze the video at a high, human level.

This is the process that intrigues DataHive the most: how to translate Big Data into actual insight.

One of the understated advantages that Netflix has gained through its electronic collation of movies is the ability to create microgenres based on basic traits and personal preferences. Many of us have seen a Netflix category like “Oscar-winning thrillers starring Meryl Streep” or something along those lines, but the logic involved is quite interesting. The role that Todd Yellin, Netflix’ VP of Product, played in translating movies into categories was vital in differentiating Netflix.

There is a very logical structure for these genres that the article describes:

Region + Adjectives + Noun Genre + Based On… + Set In… + From the… + About… + For Age X to Y

This Big Data “grammar” ends up being the secret sauce that Netflix uses to create these categories from a few hundred basic descriptors. However, Netflix also had to actually assign descriptors to each movie, which was an important additional process to implement.

Netflix actually rates all of its movies on a 1-to-5 scale across a number of different areas, based on a human’s perspective. An actual viewer creates this initial rating, which is subject to personal bias and perspective no matter how many rules the viewer is given. This human interaction ends up being vital to the processing and analysis of Big Data.
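As an illustration of how such a grammar could be assembled programmatically, here is a hypothetical sketch; the sentence structure is the one quoted above, while the helper function and tag values are my own invention, not Netflix code:

```python
# Illustrative sketch of the genre "grammar" quoted above. The sentence
# structure comes from the article; the helper and tag values are made up.
def microgenre(region=None, adjectives=None, noun_genre=None, based_on=None,
               set_in=None, from_the=None, about=None, for_ages=None):
    parts = []
    if region:      parts.append(region)
    if adjectives:  parts.append(" ".join(adjectives))
    parts.append(noun_genre)                        # the one required element
    if based_on:    parts.append(f"Based on {based_on}")
    if set_in:      parts.append(f"Set in {set_in}")
    if from_the:    parts.append(f"From the {from_the}")
    if about:       parts.append(f"About {about}")
    if for_ages:    parts.append(f"For Age {for_ages[0]} to {for_ages[1]}")
    return " ".join(parts)

# The descriptor tags would come from human raters scoring each title on a
# 1-to-5 scale; here they are hard-coded for the example.
print(microgenre(adjectives=["Critically-acclaimed", "Emotional"],
                 noun_genre="Underdog Movies",
                 from_the="1980s"))
# -> "Critically-acclaimed Emotional Underdog Movies From the 1980s"
```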

(This Big Data perspective is not unique to movies. In baseball, the raw data associated with player performance had been available for decades, but it was not until a volunteer SABR (Society for American Baseball Research) army set upon the data that baseball became the flagbearer of analytical thought and Big Data usage that it is today.)

The categorization of our daily lives is often based on a few hundred characteristics analyzed through Big Data. It is not the complexity of taxonomy or ontology that ends up providing the greatest insight. No matter who the data scientist is, Big Data insight needs to be clear to actual end users or else all that brilliance is wasted.

3. Video is the next Big thing in Big Data. Big Data is a lazy phrase used to describe several different types of trends: the exponential growth of structured data, the increased need to curate and collate unstructured data, the exponential growth of data sources, and the challenges of gaining insight from this data ecosystem that is literally too large to comprehend as we get up to petabyte-scale Big Data.

However, a fundamental component of Big Data is that it represents new data tools and challenges that have previously not existed in the enterprise analytics and data management worlds. In that context, video is going to be the next great challenge in Big Data. Consider how challenging text is as a Big Data challenge. It is not enough to store every character and create specific keyword relationships, as challenging as that is. True text analysis needs to include the close reading and sentiment analysis challenges associated with literature and poetry.

Video takes this analysis to another level. Instead of analyzing characters, video analytics would ideally analyze every pixel on a frame-by-frame basis, while also understanding the language being used in context and picking up visual cues and outliers, as video security algorithms do. It is safe to say that no vendor or company has currently reached this level of sophistication. Even a Big Data pioneer like Netflix finds that it needs a certain level of human interaction simply to estimate the degree to which a movie is a “Thriller” or a “Drama” or a “Comedy.”

Because of this, video analytics are still in their infancy. DataHive has a gut feeling that the phrase “Big Data” will go away in the next couple of years as Hadoop, sentiment analysis, and cloud-based storage and analytics continue to become more commonly used. However, we will need to brush the phrase back off, or perhaps find a new name, as video analytics finally is ready for its day in the sun. As interesting as companies such as Ooyala, with their current video analytic capabilities, already are, they have just scratched the surface of insight that they will eventually provide to the world.

So, the big takeaways for DataHive are threefold:

1) Big Data requires metadata to be useful. If the right metadata isn’t already in place, create it from scratch. It’s better to create a metadata layer that makes sense than to simply spin your wheels with existing Big Data that may lack the context needed to get from inputs to insights. Whether we’re talking about movies, baseball, or other favorite pastimes, Big Data only makes a big impact when it is filtered correctly.

2) Make sure that your metadata outputs are easily comprehensible. Netflix’ genre titles are easy to understand, regardless of whether an 8-year-old or an 8th degree Big Data Master is reading the title. By intentionally simplifying the categorization of films, Netflix provides much greater context. Netflix could easily have automated a list that states “Jennifer Lawrence has 34 minutes and 12 seconds of screen time in the Hunger Games” or whatever the number actually is, but this measurement is trivial compared to the traits and categories that customers actually want. Ultimately, Big Data outputs need to be tailored to users regardless of the complexity or elegance of analysis.

3) Video is ready for a quantum leap in analytics and Big Data. No vendor is currently able to bring the combination of video identification, video viewing analytics, video sentiment, and video content analysis together, meaning that this Big Data problem will be around for the foreseeable future. However, big challenges are also big opportunities. DataHive Consulting looks forward to playing a part in bringing these disparate video functions together into an integrated video analytics solution.