Dataguise Survey Reveals 80 Percent of Enterprises Placing Importance on Identifying Sensitive Data
Social Security Numbers, Credit Card Information and other Personally Identifiable Information Important to Secure in Hadoop Deployments

According to analysts, IT organizations with Apache Hadoop deployments should be aware of the potential security problems. In particular, the use of Hadoop to combine and store data from several sources can result in a number of problems related to identifying and securing sensitive data. Hadoop deployments can include a variety of data classifications with disparate security requirements. The key to ensuring compliance is to select the appropriate security solution for the Hadoop distribution.
In the qualitative survey of enterprise Hadoop users conducted by Dataguise, data was collected from 62 enterprise respondents during the recently held O'Reilly Strata and RSA Conferences. Key findings of the survey included the following:
· 80% of the enterprises surveyed feel it is important to know whether sensitive data is stored in their Hadoop environment.
· 77% feel it is important to protect access to the sensitive data stored in their Hadoop environment.
· 33% store sensitive data in Hadoop, including Social Security numbers, credit card numbers and addresses.
· 43% of survey participants are currently testing Hadoop and 31% have active production environments.
· Data in Hadoop environments consists primarily of log files (55%), followed by structured DBMS data (36%) and mixed data types (24%).
· Company divisions using Hadoop include marketing (28%), sales (23%) and customer support (23%), with other divisions making up the balance.
· Major challenges faced during Hadoop implementations include lack of skills (35%), Hadoop usability (23%) and security management (21%).
As petabytes of new data accumulate and propagate across businesses, much of this data comes from external sources and from customer interaction channels, such as web sites, call centers, Facebook, and Twitter. Other data originates from traditional data repositories such as RDBMS and file servers. To mine these large volumes and varieties of data in a cost-efficient way, companies are adopting new technologies such as Apache Hadoop. Line-of-business managers are benefiting from Hadoop and its ability to enable the analysis of previously inaccessible data patterns, but security officers are concerned about the nature of the information and its uncontrolled accessibility. They are well aware of the potentially catastrophic financial losses and brand damage that compliance breaches can cause to their business.
Contact:
Joe Austin
The Ventana Group
Austin, TX
(818) 332-6166
joe.austin@ventanapr.com
http://www.theventanagroup.com
###