Content-Based Access Control
Submitted to the Department of Electrical Engineering and Computer Science and the
Graduate Faculty of the University of Kansas
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
Date defended: April 3, 2015
The Dissertation Committee for Wenrong Zeng certifies
that this is the approved version of the following dissertation:
Date approved:
Abstract
In conventional databases, the most popular access control models require policies to be specified
explicitly and manually for each role of every user against each data object. In today’s
large-scale content-centric data sharing, such approaches can be impractical
because of the explosive growth of data and the sensitivity of data objects.
Moreover, conventional database access control policies do not function when
the semantic content of data is expected to play a role in access decisions. Users are often
over-privileged, and ex post facto auditing is enforced to detect misuse of the privileges.
Unfortunately, it is usually difficult to reverse the damage, as (a large amount of)
data has already been disclosed. In this dissertation, we first introduce Content-Based
Access Control (CBAC), an innovative access control model for content-centric information
sharing. As a complement to conventional access control models, the CBAC
model automatically makes access control decisions based on the content similarity between user
credentials and data content. In CBAC, each user is allowed by a meta-rule
to access “a subset” of the designated data objects of a content-centric database,
while the boundary of the subset is dynamically determined by the textual content
of the data objects. We then present an enforcement mechanism for CBAC that exploits
Oracle’s Virtual Private Database (VPD) to implement row-wise access control and
to prevent data objects from being abused through unnecessary access admission. To further
improve the performance of the proposed approach, we introduce a content-based
blocking mechanism that improves the efficiency of CBAC enforcement and reveals
a more relevant part of the data objects than is obtained by using only the user credentials
and data content. We also use several tagging mechanisms for more accurate textual
content matching on short text snippets (e.g. short VarChar attributes); these extract
topics, rather than pure word occurrences, to represent the content of data. In the tagging
mechanism, the similarity of content is calculated not purely from word
occurrences but from the semantic topics underlying the text content. Experimental results
show that CBAC makes accurate access control decisions with a small overhead.
Acknowledgements
In this section, I would like to express my gratitude to my advisors, my colleagues, my
committee members and my family for their encouragement, support and assistance
along the road of my PhD study.
First of all, I would like to thank my advisor Dr. Bo Luo for his valuable guidance
during my thesis work. He was a great advisor to work with. He originally led me to the
field of access control and patiently explained the fundamental background of what
it is, why it is important to database security, and its potential impact on big data
platforms. He has been very supportive and enthusiastic in all our discussions. He
often inspired me with his solid background knowledge of database security and
kindly provided insights for drawing conclusions from experimental results, which helped me
to push the experiments forward while consolidating my work. I would also like
to thank my previous advisor Dr. Xue-wen Chen. When I began my PhD, he led me
to the field of machine learning. He directed me to multi-label learning, which is another
major part of my PhD work. His guidance led me to explore multi-label applications
in image analysis with graphical modeling, and the theoretical optimization of
multi-label improvements. I am also very grateful to my committee members:
Dr. Arvin Agah, Dr. Jerzy Grzymala-Busse, Dr. Prasad Kulkarni, and Dr. Alfred
Tat-kei Ho. They offered me professional suggestions on my proposal and dissertation,
and kindly provided their insights on further directions and experiments for
my work. Without their help, I could not have finished my PhD.
Secondly, I would like to thank all my colleagues at the University of Kansas. They began
as colleagues and became my best friends. They have provided a lot of happiness,
support, and assistance in my life and study. Working together with everyone has been a
memorable part of my study. Dr. Hongliang Fei, Dr. Yi Jia,
Dr. Jintao Zhang, and Junyan Li: I am grateful to have met them during the painful yet
rewarding PhD study. When I ran into problems, they always provided valuable suggestions.
Besides, I owe special thanks to Dr. Jong Cheol Jeong. He is a valuable
colleague to work with, thorough in mind and detail-oriented in execution. He was
willing to help me whenever I felt confused and lost in research. His determination in
research has always made him my role model.
Last but not least, I want to thank my family. My parents gave me endless courage
and love during my PhD. They visited me three times from China and
supported me with all their help in my household. They are the best parents one
could ask for. My husband, whom I am very lucky to have, is the most supportive,
patient, generous and humorous man in my life; he has brought happiness and
healed the pain. My baby daughter, Brenda: I would like to thank her for being the
best project I have ever done. Although she does cry a lot, she laughs more. Her smile
is the best gift after work.
Contents
1 Introduction
1.1 Introduction
2 Related Works
2.1 Access Control Models
2.1.1 Discretionary Access Control
2.1.2 Role-Based Access Control
2.1.3 Attribute-Based Access Control
2.1.4 Policy-Based Access Control
2.1.5 Risk-Adaptable Access Control
2.1.6 Access Control Based on Content
2.2 Oracle Virtual Private Database (VPD)
3 Text Feature Extraction
3.1 TF-IDF
3.1.1 Stop Word
3.1.2 Stemming
3.2 n-Gram
3.3 Topic Modeling
3.3.1 Latent Dirichlet Allocation
3.3.2 Non-negative Matrix Factorization
3.4 TAGME
4 Content-Based Access Control Model
4.1 Background and Assumptions
4.2 Contribution
4.3 Model Definition
4.4 Content Similarity
4.5 Top-K Similarity
5 CBAC Enforcement
5.1 CBAC On-the-Fly Enforcement
5.1.1 The Basic CBAC Model
5.1.2 Experiments
5.2 Offline CBAC
5.2.1 Unsupervised Nearest Neighbor Offline Training
5.2.1.1 Brute Force Algorithm
5.2.1.2 K-D tree
5.2.1.3 Ball Tree Algorithm
5.2.2 Experiments
6 CBAC Optimizing Strategies
6.1 Content-Based Blocking
6.1.1 Naive k-means Clustering
6.1.2 The Advantage of Careful Seeding: k-means++
6.1.3 Scaled k-means++ with mini-batch Strategy
6.1.4 Experiments
6.2 Content-based labeling
6.2.1 Document labeling
6.2.2 Soundness of CBAC Enforcement
6.2.3 Experiments
7 Labeling Improvement with Multi-Label Learning (MLL)
7.1 Motivation
7.2 Problem Definition and Challenges
7.3 Background
7.4 Related Work
7.5 Methodology
7.5.1 Preliminary
7.5.2 Objective Function
7.5.3 Algorithm
7.6 Experiment
7.6.1 Data Set Statistics
7.6.2 Comparison Methods
7.6.3 Evaluation Metric
7.6.4 Results
7.7 Conclusion
8 Discussions
8.1 Computational Complexity
8.2 Negative Rules and Conflict Resolution
8.3 CBAC for XML Data
9 Conclusion
A The Top 10 Words of Non-Negative Matrix Factorization
List of Figures
2.1 Selected Access Control Models (NIST (2009))
2.2 Access Control List
2.3 Capability List
2.4 Role-Based Access Control Model (Sandhu et al. (1996))
2.5 Risk-Adaptable Access Control Notional Process (McGraw (2009))
3.1 Plate Notation of Latent Dirichlet Allocation
3.2 Plate Notation of Smoothed Latent Dirichlet Allocation
3.3 TAGME Annotation Example
5.1 ABAC Efficiency with QUERY1
5.2 ABAC Efficiency with QUERY2
5.3 Threshold CBAC Efficiency with QUERY1
5.4 Threshold CBAC Efficiency with QUERY2
5.5 Threshold CBAC + ABAC Efficiency with QUERY1
5.6 Threshold CBAC + ABAC Efficiency with QUERY2
5.7 Top-10 CBAC Efficiency
5.8 2-D K-D Tree Subspace Splits
5.9 K-D Tree Example
5.10 Offline Efficiency
6.1 Threshold CBAC + Blocking Efficiency with QUERY1
6.2 Threshold CBAC + Blocking Efficiency with QUERY2
6.3 Threshold CBAC + ABAC + Blocking Efficiency with QUERY1
6.4 Threshold CBAC + ABAC + Blocking Efficiency with QUERY2
6.5 Top-10 CBAC + Blocking Efficiency
6.6 Soundness of CBAC Enforcement
6.7 Threshold CBAC + Labeling Efficiency with QUERY1
6.8 Threshold CBAC + Labeling Efficiency with QUERY2
6.9 Threshold CBAC + ABAC + Labeling Efficiency with QUERY1
6.10 Threshold CBAC + ABAC + Labeling Efficiency with QUERY2
6.11 Top-10 CBAC + Labeling Efficiency
6.12 Top-10 CBAC + Blocking + Labeling Efficiency
6.13 Density Fit
6.14 Cumulative Probability Fit
6.15 NMF 100 Density Fit
6.16 NMF 100 Cumulative Probability Fit
7.1 Scene of Sunset at Sea
7.2 Molecular Function Annotation of P75957
List of Tables
2.1 Access Control Matrix Example
2.2 VPD Function Example
2.3 VPD Policy Example
5.1 Schemas
5.2 Column Description
5.3 CBAC Top-10 Example
5.4 CBAC Threshold Example
7.1 Multi-Label Example
7.2 Binary Relevance Matrix Example
7.3 Label Power-Set Example
7.4 Label Power-Set Matrix Example
7.5 Statistics of Data Sets
7.6 Imbalance Rate (%)
7.7 Sample Sizes of Data Sets
7.8 Macro-Averaging F1 Measure (%) ↑
7.9 Micro-Averaging F1 Measure (%) ↑
7.10 Subset Accuracy (%) ↑
A.1 The Top 10 Words of Non-Negative Matrix Factorization with 10 Topics
A.2 The Top 10 Words of Non-Negative Matrix Factorization with 20 Topics
A.3 The Top 10 Words of Non-Negative Matrix Factorization with 50 Topics
A.4 The Top 10 Words of Non-Negative Matrix Factorization with 100 Topics
Chapter 1
Introduction
1.1 Introduction
Simply put, database access control models and enforcement mechanisms define and enforce "who
can access what". Here, "who" represents a set of users/roles, and "what" represents a set of
data objects, e.g. tuples and attributes of SQL databases, or XML nodes. In conventional database
access control models, database administrators (DBAs) or data owners/users explicitly specify
the access rights to each data object for each role by GRANTing certain rights to, or REVOKEing
them from, each role. However, due to the explosive growth of data, especially content-centric data, such
approaches may not be suitable or even practical. The reason for this is threefold. Firstly, it is
determined by the characteristics of content-centric data. Content-centric data usually contains a
lot of free text. For example, the electronic health record (EHR) is a content-centric kind of data. In the
EHR, doctors not only list the basic information of patients (e.g. name, gender, age, etc.) but also describe
the symptoms of every patient. Instead of choosing exact words to describe the patients’ symptoms,
a doctor usually uses more descriptive ways to record the symptoms reported by the patients. That is
why EHR data is free text rather than formatted text. Free text, as
the example shows, can express the same semantic meaning with different term distributions.
Secondly, in content-centric databases, the data content is expected to play a role in making
access control decisions. Let us continue with the EHR example. Before reaching a conclusion
about what the patient’s problem is, doctors might need to review other patients’ records
with similar symptoms, especially for unusual diseases. In this kind of situation, it could be very
difficult to explicitly describe access rights for very large numbers of data objects, especially when
the decisions are based on content – it is too labor-intensive to require a system administrator to
manually examine every record in the database and assign access rights to each user/role. Thirdly,
in distributed and dynamic environments, it could be difficult to explicitly define access rights
for every user from remote peers; e.g., an organization could easily develop new roles without
notifying its collaborators, which happens often in information sharing. In this case, access control
decisions could be based on the remote requestor’s knowledge that is dynamically submitted with
every query. Meanwhile, in distributed information sharing, because of the sensitivity of the data
content, data owners may only want to share with people who contribute similar data, which might
reveal that they have similar interests, but they cannot specify access control rules unless they explore
the content of others’ data. To further motivate this research, let us see the following examples:
Example 1: A law enforcement agency (e.g. the FBI) holds a database of highly sensitive case
records. A director, Bob, assigns a case to agent Alice for investigation. Naturally, the director also
needs to grant Alice access to all related or similar cases. In this scenario, the concept of “related
cases” is determined by the semantic content similarity of the records, which could be geographical,
temporal, modus operandi, or just the similarity in the textual description of the case records.
Moreover, when new cases are added to the database, cases that are similar to Alice’s should be
automatically made accessible to Alice, without requiring the director to intervene further. For
example, a newly added related case could be crucial to the case being investigated. Unfortunately,
in the existing database access control paradigm, this type of access control description
is not supported. Meanwhile, it is too labor-intensive for the director to manually examine every
record to grant/revoke access. In practice, the Multi-Level Security (MLS) model is often adopted
and every agent is granted access to a large number of records – everything lower than or equal to
his/her security level. Similarly, many content processing companies (e.g. survey processing and
telemarketing firms) allow every employee to access all the (potentially sensitive) customer records
in their databases, due to the lack of capability to enforce access control based on the textual content
of the records. In all these scenarios, information sharing could be either too conservative
or abused through unnecessary information leakage.
Example 2: In traditional subscription systems, users pay for access to entire periodicals.
For instance, a researcher interested in “information security” may subscribe to IEEE Transactions
on Knowledge and Data Engineering, though he is only interested in a small portion of the papers
in the journal. An alternative approach would be for each user to subscribe to a set of tags, with
each paper (as a record in the database) tagged by keywords. Thus, access control decisions
could be made by matching the user’s tags with the paper’s tags. However, such an approach suffers from two
serious drawbacks: (1) tag quality is essential to the approach, but quality control is a nontrivial
problem; (2) the number of accessible papers could be too small or too large; for instance,
a paper carrying a tag may be only slightly related to the tag topic. In a desired solution, the
subscriber is expected to submit his interests as a textual description or identify some seed articles
(e.g. his own papers), and then be granted access to articles with similar content. In the ideal
solution, access granted on the basis of content similarity would further improve the
work efficiency of users by giving them a well-selected set of articles.
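The tag-matching alternative described in Example 2 can be sketched as follows. This is a minimal illustration only: the function name, paper identifiers, and tags are hypothetical, and the rule "grant access when at least one tag is shared" is an assumption about how such a scheme might match tags.

```python
def accessible_by_tags(user_tags, paper_tags):
    """Return ids of papers that share at least one tag with the subscriber.

    Hypothetical matching rule: any shared tag grants access.
    """
    return [paper_id for paper_id, tags in paper_tags.items() if user_tags & tags]


# Hypothetical sample data: each paper is a record tagged with keywords.
papers = {
    "paper-1": {"information security", "access control"},
    "paper-2": {"query optimization", "information security"},
    "paper-3": {"data mining"},
}
subscriber_tags = {"information security"}

print(accessible_by_tags(subscriber_tags, papers))
```

In this sketch "paper-2" is granted merely because it carries the tag, even if it is mainly about query optimization, which is the second drawback noted in Example 2.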
Example 3: In distributed information sharing scenarios, some data owners will only share their
records with peers who contribute relevant data, so that the sharing is mutually beneficial. For
instance, in a collaborative project with the Department of Public Administration studying citizen
engagement, surveyees were found to be willing to share their opinions with others who have similar
opinions. In this case, opinions are represented by a short paragraph of text. In other scientific
research domains, we also see investigators sharing research data (in a shared and access-controlled
repository) with colleagues who contribute similar data. Let us revisit Example 1: when the FBI
collaborates with another law enforcement agency (say, the CIA), they only share “related cases”, while
the case relationships are assessed by semantic content similarity. Privacy-preserving similar
document matching (Murugesan et al. (2010); Scannapieco et al. (2007)) has been used to identify
and share similar documents. However, in the scenario where the FBI is willing to disclose cases that are
similar to a known CIA case, an alternative solution is to employ database access control to allow
the CIA to access the “similar cases”.
Example 4: Healthcare information sharing is strictly governed by HIPAA. Medical records are
well protected by healthcare providers, and are only shared under very rigorous rules. However,
within the facility, users (doctors, nurses, researchers) are often given broader access privileges,
while ex post facto auditing is enforced to detect and punish misuse of the privileges (Appari &
Johnson (2010); Malin et al. (2007); Boxwala et al. (2011); Rostad & Edsberg (2006)). Another
thrust of solutions employs the "break the glass (BTG)" mechanism – to allow users to break access
control rules in a controlled manner in special circumstances (Ferreira et al. (2006)). Additional
auditing will be performed once a user invokes the BTG policy.
From the examples, we can see that conventional deterministic database access control models
fall short in content-centric data sharing scenarios. In such cases, a new access control model
is needed to generate access decisions based on the semantic
content similarity of the data. Another desirable capability of such a content-based access control
model is that the similarity of semantic content be measured with native functions provided
by the RDBMS, so that it requires only minimal intervention from database administrators (DBAs). In
this dissertation, we present a first attempt towards this endeavor: we present the content-based
access control model and enforcement mechanisms, where access rights are granted based on the
lexical similarity between the requestor’s credentials and the requested records. The new model, as
a complement to existing access control approaches, provides an effective and efficient means of
access control that exploits content features in content-rich data sharing, and represents a first effort to
solve the difficulties of content-centric database access control in the big data era. In the dissertation,
we explore the new needs of security and privacy in distributed information systems and
tackle these issues with innovative designs. We formally propose a new data-driven
access control model called the content-based access control (CBAC) model, which exploits the data
content to achieve more flexible and powerful access control semantics for content-centric
databases in information sharing. CBAC is the first attempt to create an access
control model that introduces the notion of approximate security, and it is capable of dealing with
situations where explicit access control policies are not available at all. In CBAC, we
use machine learning methods (i.e. text mining techniques) for access control modeling and
enforcement. By introducing these methods, access control principles are translated into algorithm
implementations, and in this sense, we aim to bring dynamic properties, automation and
“intelligence” into access control models.
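To make the idea of granting access by content similarity concrete, the following is a minimal sketch, not the enforcement mechanism developed in this dissertation. It assumes raw term-frequency vectors and cosine similarity as stand-ins for the content features and similarity function; the helper names, sample records, and the 0.5 threshold are all hypothetical. It shows two decision rules, a similarity threshold and a top-K cutoff.

```python
import math
import re
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector (assumed stand-in for richer content features)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def accessible_by_threshold(credential, records, threshold):
    """Ids of records whose similarity to the credential meets the threshold."""
    cred = tf_vector(credential)
    return [rid for rid, text in records.items()
            if cosine(cred, tf_vector(text)) >= threshold]


def accessible_top_k(credential, records, k):
    """Ids of the k records most similar to the credential."""
    cred = tf_vector(credential)
    ranked = sorted(records,
                    key=lambda rid: cosine(cred, tf_vector(records[rid])),
                    reverse=True)
    return ranked[:k]


# Hypothetical case records and a textual credential for agent Alice.
records = {
    "case-1": "armed robbery at a downtown bank with a stolen vehicle",
    "case-2": "bank robbery suspect fled in a stolen vehicle",
    "case-3": "tax fraud investigation of an offshore account",
}
credential = "investigating a bank robbery involving a stolen vehicle"

print(accessible_by_threshold(credential, records, 0.5))
print(accessible_top_k(credential, records, 1))
```

A record added later is evaluated by the same rule, so no per-record grant is needed. The sketch does no stop-word removal or stemming, so common words such as "a" inflate the scores.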
Chapter 2
Related Works
Computer technology has transformed daily life, including people’s education, careers, and
entertainment. It makes it convenient for people to seek information, find
jobs, and enjoy music and films. Meanwhile, computer technology has also transformed the way
companies are run, including how they find suppliers and compare offers, how they collect, store
and publish product information, and how they maintain close working relationships with clients. Not
only has computer technology improved the efficiency of everyone’s daily life, it has also changed
how information is created, processed, transferred, stored, and concealed. Nowadays, one of the
most important security problems is preventing unauthorized access to information, that is, preventing
people from accessing confidential information they are not allowed to see. The common
risks from unauthorized access include, but are not limited to:
Unauthorized disclosure of information
Disruption of computer services
Loss of productivity, which delays normal computer activities in time-critical applications
Financial loss, such as from corruption of information or disruption of services
Legal implications due to lawsuits from investors, customers, or the public
Blackmail, where intruders extort money from the company by threatening the security system
Figure 2.1: Selected Access Control Models (NIST (2009))
To avoid these risks, researchers have developed different access control models that define
"who" has the authority to access "what". In this chapter, we introduce some common access control
models. Figure 2.1 is modified from Figure 1 (NIST (2009)) to show the relationship
among these models. We follow the list of access control models (NIST (2009)) and add more
details about the models that have concrete mathematical definitions.
Database access control research can be roughly categorized into access control models and
access control enforcement. Relational access control models can be classified into mandatory
access control (MAC) (Jajodia & Sandhu (1991); Sandhu (1993); Sandhu & Chen (1998); Winslett et al.
(1994); McCune et al. (2006); Lindqvist (2006); Thuraisingham (2009); Upadhyaya (2011)), discretionary
access control (DAC) (Moffett et al. (1990); Thomas et al. (1993); Ahn (2009); Li
(2011); Downs et al. (1985)) and role-based access control (RBAC) (Ferraiolo et al. (2001); Osborn
et al. (2000); Sandhu et al. (1996)).
Mandatory access control (MAC) requires that only the database administrators have the authority
to manage access control policies and their usage. These cannot be modified
by any users other than the administrators. Therefore, MAC is most often used in systems
or databases where the highest priority is placed on confidentiality. The assignment and enforcement
of access control policies under MAC models place strict restrictions on users. Any dynamic
alteration of an access control policy requires detailed manual investigation of the policy
by database administrators. One obvious shortcoming is that any update might introduce conflicts
into the overall set of access control policies. Also, frequent database updates are labor-intensive
for administrators. Another shortcoming of MAC is that it can be overprotective: it may unnecessarily
over-classify data through “the high-water mark principle” and limit the ability to transfer information
between users and databases. On the other hand, most real-world RDBMSs implement a table/column-level
DAC or RBAC similar to the one in System R (Griffiths & Wade (1976)).
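As a rough illustration of level-based mandatory access control, the sketch below encodes the "lower than or equal to his/her security level" read rule mentioned in Chapter 1, together with one common reading of the high-water mark principle, in which derived data inherits the highest label among its sources. The level names and function names are hypothetical, and real MAC systems involve more than these two checks.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical ordered security levels."""
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3


def can_read(subject_level, object_level):
    """A subject may read an object only at or below its own clearance."""
    return subject_level >= object_level


def high_water_mark(source_levels):
    """Label derived data with the highest level among its sources (assumed reading)."""
    return max(source_levels)


print(can_read(Level.SECRET, Level.CONFIDENTIAL))
print(can_read(Level.CONFIDENTIAL, Level.SECRET))
print(high_water_mark([Level.UNCLASSIFIED, Level.SECRET]).name)
```

The last call shows how over-classification can arise: a result that is mostly unclassified is labeled SECRET because one of its sources was.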
Discretionary access control (DAC) is the type of access control in which users have complete
authority over all the data they own. They also have the authority to GRANT access to their own
data to other users, or to REVOKE it. DAC requires permission assignment between
the users who hold the data and those who want to access it. Thus, it is commonly known as the
“need-to-know” model. Compared to MAC, DAC has the obvious advantage of enabling fine-grained
control over system or database objects. Data objects can have access control restrictions
with the minimum rights needed. However, security policies are extremely difficult to enforce under DAC, as
access control rights are owned by users. Compromised users could introduce threats to the
database and propagate them to other users. Thus, DAC is highly prone to security problems.
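The owner-controlled GRANT/REVOKE behavior of DAC can be sketched with a toy in-memory store. This is an illustrative assumption, not how any particular RDBMS implements it; the class name, method names, and sample users are hypothetical.

```python
class DacStore:
    """Toy discretionary access control: each object's owner manages its grants."""

    def __init__(self):
        self.owner = {}   # object name -> owning user
        self.grants = {}  # object name -> set of (user, right) pairs

    def create(self, user, obj):
        """Register obj with user as its owner."""
        self.owner[obj] = user
        self.grants[obj] = set()

    def grant(self, grantor, obj, user, right):
        """Only the owner may grant a right on obj to another user."""
        if self.owner[obj] != grantor:
            raise PermissionError("only the owner may grant rights")
        self.grants[obj].add((user, right))

    def revoke(self, grantor, obj, user, right):
        """Only the owner may revoke a previously granted right."""
        if self.owner[obj] != grantor:
            raise PermissionError("only the owner may revoke rights")
        self.grants[obj].discard((user, right))

    def allowed(self, user, obj, right):
        """Owners hold every right; others need an explicit grant."""
        return self.owner[obj] == user or (user, right) in self.grants[obj]


store = DacStore()
store.create("alice", "case_records")
store.grant("alice", "case_records", "bob", "SELECT")
print(store.allowed("bob", "case_records", "SELECT"))
store.revoke("alice", "case_records", "bob", "SELECT")
print(store.allowed("bob", "case_records", "SELECT"))
```

Because the owner alone decides who receives rights, a compromised owner account can hand out access freely, which is the weakness described above.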
Role-based access control (RBAC) is the type of access control model in which users are first
assigned to different roles according to their job functions in an enterprise, and permissions
are then assigned not directly to users but to roles. This is in contrast to the above two methods
of access control, which GRANT/REVOKE user access on a rigid, object-by-object basis. In
RBAC, users can easily be granted or revoked access when their work status changes.
In large organizations, clustering many users into a single role allows much more convenient
management. RBAC also integrates support for the least-privilege principle, separation of duties,
and central administration of role membership. Although RBAC shows a great advantage over the above two
conventional access control models, it also has its own limitations. In large systems,
role membership, hierarch
