Simon Miles^1 and Nathan Griffiths^2
^1 King's College London, UK, simon.miles@kcl.ac.uk
^2 University of Warwick, UK, nathan.griffiths@warwick.ac.uk
Abstract. Reputation enables customers to select between providers, and balance risk against other aspects of service provision. For new providers that have yet to establish a track record, negative ratings can significantly impact on their chances of being selected. Existing work has shown that malicious or inaccurate reviews, and subjective differences, can be accounted for. However, an honest balanced review of service provision may still be an unreliable predictor of future performance if the circumstances differ. Specifically, mitigating circumstances may have affected previous provision. For example, while a delivery service may generally be reliable, a particular delivery may be delayed by unexpected flooding. A common way to ameliorate such effects is by weighting the influence of past events on reputation by their recency. In this paper, we argue that it is more effective to query detailed records of service provision, using patterns that describe the circumstances to determine the significance of previous interactions.
Keywords: Reputation, Trust, Provenance, Circumstances
In online service-oriented systems, an accurate assessment of reputation is essential for selecting between alternative providers. Existing methods for reputation assessment have focused on coping with malicious or inaccurate ratings, and with subjective differences, and do not consider the full interaction history and context. The context of previous interactions contains information that could be valuable for reputation assessment. For example, there may have been mitigating circumstances for past failures, such as where a freak event affected provision, or a previously unreliable sub-provider has been replaced. Existing methods do not fully take into account the circumstances in which agents have previously acted, meaning that assessments may not reflect the current circumstances, and so be poor predictors of future interactions. In this paper, we present a reputation assessment method based on querying detailed records of service provision, using patterns that describe the circumstances to determine the relevance of past interactions. Employing a standard provenance model for describing these circumstances gives a practical means for agents to model, record and query the past. Specifically, the contributions of this paper are as follows.
An overview of our approach, with an example circumstance pattern and a high-level evaluation, appears in [10]. This paper extends that work, presenting an in-depth description of the approach and architecture for provenance-based reputation, additional circumstance patterns, and more extensive evaluation. Reputation and trust are closely related concepts, and there is a lack of consensus in the community regarding the distinction between them [11]. For clarity, in this paper we use the term reputation to encompass the concepts variously referred to as trust and reputation in the literature. We discuss related work in the following section, before presenting our approach in Section 3. The baseline reputation model is described in Section 4 and we present example circumstance patterns in Section 5. Evaluation results are described in Section 6 and our conclusions in Section 7.
Given the importance of reputation in real-world environments, there continues to be active research interest in the area. There are several effective computational reputation models, such as ReGreT [13], FIRE [7], TRAVOS [16] and HABIT [15] that draw on direct and indirect experiences to obtain numerical or probabilistic representations for reputation. In dynamic environments, where social relationships evolve and the population changes, it can be difficult to assess reputation as there may be a lack of evidence [1, 7, 8, 14]. Stereotypes provide a useful bootstrapping mechanism, but there needs to be a sufficient evidence base from which to induce a prediction model [1, 3, 14, 18].

Where there is little data for assessing reputation, individual pieces of evidence can carry great weight and, where negative, may cause a provider rarely to be selected, and never be given the opportunity to build their reputation. While reviewer honesty can be tested from past behaviour and dishonest reviews ignored, it is possible for a review to be accurately negative, because of poor service provision, and still not be an accurate predictor of future behaviour. These are examples of mitigating circumstances, where the context of service provision rather than an agent's ability meant that it was poorly provided, but that context was temporary. Many approaches use recency to ameliorate such effects. However, we argue that recency is a blunt instrument. First, recent provision may have been affected by mitigating circumstances, and recency will weight the results higher than older but more accurate data. Second, older interactions may remain good predictors of reliability, because of comparable circumstances.
[Fig. 2. An architecture for provenance-based service provider reputation (inputs include the provenance records of the client and its acquaintances, and the mitigation patterns).]
This allows mitigation, situation, indirect responsibility, and other such context to be accounted for, and the interdependencies of providers to be understood. Mitigation can have many forms, such as a subsequently replaced subcontractor failing to deliver on time, or a client failing to specify required conditions (e.g. expiration date of goods being shipped). The assessor looks for patterns in the provenance that indicate situations relevant to the current client's needs and mitigating circumstances affecting the providers. Provenance data is suitable for this because it includes the causal connections between interactions, and so captures the dependencies between agents' actions. It can include multiple parties to an interaction and their organisational connections. The assessor filters the provenance for key subgraphs from which reputation can be assessed using existing approaches, by identifying successful and failed interactions and adjusting these by mitigation and situation relevance. Assessing reputation in this way avoids the problem of when to update trust, as whenever an assessment is required it is determined using all available evidence.

Reputation enables the assessment and management of the risk associated with interacting with others, and enables agents to balance risk against factors such as cost when considering alternative providers. Such environments can be viewed as service-oriented systems, in which agents provide and consume services. We take an abstract view of service-oriented systems, without prescribing a particular technology. We assume that there are mechanisms for service advertisement and for service discovery. We also assume that service adverts can optionally include details of provision, such as specifying particular sub-providers if appropriate. Finally, we assume that agents record details of their interactions in the form of provenance records, which can be used to assess reputation. The practicality of this last requirement is discussed in Section 6.3.
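To make the assessor's procedure described above more concrete, the following is a minimal sketch of how filtering and weighting might be combined. All of the names (Pattern, assess_reputation, the outcome function) are illustrative assumptions rather than the authors' implementation; interaction subgraphs are treated abstractly.

```python
from typing import Callable, Iterable, List

# Illustrative sketch of the assessor's flow: each recorded interaction is a
# provenance subgraph; each circumstance pattern is a predicate over such a
# subgraph plus a weight to apply when it matches. These names and types are
# assumptions for illustration only.

class Pattern:
    def __init__(self, matches: Callable[[object], bool], weight: float):
        self.matches = matches   # does this circumstance apply to the interaction?
        self.weight = weight     # how much the interaction should count if it does

def assess_reputation(interactions: Iterable[object],
                      outcome: Callable[[object], float],
                      patterns: List[Pattern]) -> float:
    """Weighted mean outcome over recorded interactions.

    `outcome` maps an interaction subgraph to a rating in [-1, +1];
    matching patterns scale how much that rating contributes.
    """
    total, norm = 0.0, 0.0
    for g in interactions:
        weight = 1.0
        for p in patterns:
            if p.matches(g):
                weight = min(weight, p.weight)  # strongest mitigation dominates
        total += weight * outcome(g)
        norm += weight
    return total / norm if norm else 0.0
```

Because assessment is performed on demand over whatever evidence is currently available, no trust value needs to be maintained and updated between interactions.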
Provenance records not only contain rich information that enables reasoning about aspects such as mitigating circumstances, but they also provide a means to maximise the amount of information available for reputation assessment. In this section, we describe how reputation can be driven by provenance records. For the purposes of illustration we consider FIRE [7], but note that other approaches, such as those discussed in Section 2 or machine learning techniques, can similarly be adapted to use provenance records.
4.1 The FIRE reputation model
FIRE combines four different types of reputation and trust: interaction trust from direct experience, witness reputation from third party reports, role-based trust, and certified reputation based on third-party references [7]. The direct experience and witness reputation components are based on ReGreT [13]. In this paper our focus is on using provenance records of interactions to support reputation, and on defining query patterns for mitigating circumstances. Role-based trust and certified reputation are tangential to this focus, as they are not directly based on interaction records. Therefore, we do not consider role-based trust and certified reputation in this paper (although we do not argue against their usefulness). Reputation is assessed in FIRE from rating tuples of the form (a, b, c, i, v), where a and b are agents that participated in interaction i such that a gave b a rating of v ∈ [−1, +1] for the term c (e.g. reliability, quality, timeliness). A rating of +1 is absolutely positive, −1 is absolutely negative, and 0 is neutral. In FIRE, each agent has a history size H and stores the last H ratings it has given in its local database. FIRE gives more weight to recent interactions using a rating weight function, ωK, for each type of reputation, where K ∈ {I, W}, representing interaction trust and witness reputation respectively. The trust value agent a has in b with respect to term c is calculated as the weighted mean of the available ratings:
$$T_K(a, b, c) = \frac{\sum_{r_i \in R_K(a,b,c)} \omega_K(r_i) \cdot v_i}{\sum_{r_i \in R_K(a,b,c)} \omega_K(r_i)} \quad (1)$$
where RK(a, b, c) is the set of ratings stored by a regarding b for component K, and vi is the value of rating ri. To determine direct interaction reputation an assessing agent a extracts the set of ratings, RK(a, b, c), from its database that have the form (a, b, c, _, _), where b is the agent being assessed, c is the term of interest, and "_" matches any value. These ratings are scaled using a rating recency factor, λ, in the rating weight function, and combined using Equation 1. FIRE instantiates the rating weight function for interaction trust as:
$$\omega_I(r_i) = e^{-\Delta t(r_i)/\lambda} \quad (2)$$
where ωI (ri) is the weight for rating ri and ∆t(ri) the time since ri was recorded. Agents maintain a list of acquaintances, and use these to identify witnesses in order to evaluate witness reputation. Specifically, an evaluator a will ask its acquaintances for ratings of b for term c, who either return a rating or pass on the request to their acquaintances if they have not interacted with b. FIRE uses a variation of Yu and Singh’s referral system [22], with parameters to limit the branching factor and referral length to limit the propagation of requests. The
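As a concrete illustration of Equations 1 and 2, the following Python sketch computes interaction trust from a local rating history. The Rating fields, the default value of λ, and the example data are assumptions for illustration, not taken from FIRE's implementation.

```python
import math
from dataclasses import dataclass

@dataclass
class Rating:
    rater: str      # agent a
    ratee: str      # agent b
    term: str       # c, e.g. "timeliness"
    value: float    # v in [-1, +1]
    age: float      # Δt(r_i): time since the rating was recorded

def omega_I(rating: Rating, recency_lambda: float) -> float:
    """Recency weight of Equation 2: e^(-Δt(r_i)/λ)."""
    return math.exp(-rating.age / recency_lambda)

def interaction_trust(ratings, a, b, c, recency_lambda=5.0):
    """Weighted mean of a's ratings of b for term c (Equation 1)."""
    relevant = [r for r in ratings if r.rater == a and r.ratee == b and r.term == c]
    if not relevant:
        return None  # no direct evidence available
    weights = [omega_I(r, recency_lambda) for r in relevant]
    return sum(w * r.value for w, r in zip(weights, relevant)) / sum(weights)

# Example: a recent positive and an older negative rating of provider "B";
# recency weighting makes the recent rating dominate.
history = [Rating("A", "B", "timeliness", +1.0, age=1.0),
           Rating("A", "B", "timeliness", -0.5, age=10.0)]
print(interaction_trust(history, "A", "B", "timeliness"))
```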
to determine the sustainability of a garment, details of the fabric and raw materials (e.g. cotton, dye, and fasteners) must also be evaluated. Terms are often domain-specific and are not further discussed here.
PROV data describes past processes as causal graphs, captured from multiple parties and interlinked. The interactions which comprise a service being provided can be described by a sub-graph, and inspecting features of the sub-graphs, such as through a SPARQL query [20], can determine the extent to which they inform reputation. In this section, we specify three patterns of mitigating circumstances that could be detected in provenance data. These examples are not intended to be exhaustive, but illustrate the form of such patterns in our approach.
5.1 Unreliable sub-provider
In the first mitigating circumstance, a provider’s poor service on a past occasion was due to reliance on a poor sub-provider for some aspect of the service. If the provider has changed sub-provider, the past interaction should not be considered relevant to their current reputation^3. This is a richer way of accounting for sub-provider actions than simply discounting based on position in a delegation chain [4]. In other words, Provider A’s reputation should account for the fact that previous poor service was due to Provider A relying on Provider B, who they no longer use. The provenance should show:
A provenance pattern showing reliance on a sub-provider in a particular instance can be defined as follows. For reference, activities are labelled with An (where n is a number) and entities are labelled with En. Fig. 3 illustrates this pattern, along with some of the specific cases below.
Step 1 A client process, A1, sends a request, E1, for a service to a process, A2, for which Provider A is responsible. In the PROV graph, this means that E1 wasGeneratedBy A1, A2 used E1, and A2 wasAssociatedWith Provider A.
Step 2 A2 sends a request, E2, to a service process, A3, for which Provider B is responsible. In the PROV graph, this means that E2 wasGeneratedBy A2, A3 used E2, and A3 wasAssociatedWith Provider B.
Step 3 A3 completes the action and sends a result, E3, back to A2. In the PROV graph, this means that E3 wasGeneratedBy A3, and A2 used E3.

^3 Such a situation may indicate poor judgement and so have a degree of relevance, but this is not considered in this paper.
[Fig. 3. Provenance graph pattern for the unreliable sub-provider circumstance: the client process (A1) sends Request (E1) to Provider A's process (A2), which sends Request (E2) to Provider B's process (A3), receives Response (E3), and returns Response (E4) to the client.]
Step 4 A2 completes the service provision, sending the result, E4, back to A1, so that the client has received the service requested. In the PROV graph, this means that E4 wasGeneratedBy A2, and A1 used E4.
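Assuming the provenance records are exposed as a PROV-O RDF graph, the Steps 1-4 pattern above could be detected with a SPARQL query, for instance via rdflib in Python. The query below is an illustrative sketch of the pattern, not the authors' actual query; the variable names mirror the labels A1-A3 and E1-E4 used above.

```python
from rdflib import Graph, URIRef

# Sketch of the unreliable sub-provider pattern (Steps 1-4) as SPARQL over PROV-O.
SUB_PROVIDER_PATTERN = """
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT DISTINCT ?a2 ?subProvider WHERE {
    ?e1 prov:wasGeneratedBy ?a1 .                     # Step 1: client request E1
    ?a2 prov:used ?e1 ;
        prov:wasAssociatedWith ?providerA .
    ?e2 prov:wasGeneratedBy ?a2 .                     # Step 2: sub-request E2
    ?a3 prov:used ?e2 ;
        prov:wasAssociatedWith ?subProvider .
    ?e3 prov:wasGeneratedBy ?a3 .                     # Step 3: sub-result E3
    ?a2 prov:used ?e3 .
    ?e4 prov:wasGeneratedBy ?a2 .                     # Step 4: response E4
    ?a1 prov:used ?e4 .
    FILTER (?subProvider != ?providerA)
}
"""

def sub_provider_interactions(provenance: Graph, provider_a: URIRef):
    """Return (service activity, sub-provider) bindings where provider_a relied
    on another provider within a recorded provision."""
    return list(provenance.query(SUB_PROVIDER_PATTERN,
                                 initBindings={"providerA": provider_a}))
```

Cases 1 and 2 below would extend this query with further triple patterns over the attributes or timing of Provider B's contribution.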
We can distinguish cases in which Provider B would be the likely cause of poor quality service provision. Each case corresponds to an extension of the above provenance pattern.

Case 1. An aspect of the result of provision is poor, and that aspect is apparent in Provider B's contribution. For example, Provider A may have provided a website for a company which appears poor due to low resolution images supplied by Provider B. The extensions to the original pattern are as follows.
Case 2. The poor provision may be due not to the eventual outcome but to the time taken to provide the service, and this can be shown to be due to the slowness of Provider B. The extensions to the original pattern are as follows.
The final criterion required for the above patterns to affect Provider A's reputation assessment is to show that Provider A no longer uses Provider B. This could be through (i) recent provenance of Provider A's provision showing no use of Provider B, or (ii) Provider A's advert for their service specifying which sub-provider they currently use. The latter is assumed in the evaluation below. We also note that a variation of this pattern is useful, namely to identify situations in which successful service provision was due to a good sub-provider
[Fig. 5. Provenance graph pattern for the poor organisation culture circumstance: the client process (A1) sends Request (E1) to Provider A's process (A2), which returns Response (E2); Provider A acts on behalf of Organisation B.]
it (E3) was not, e.g. water damage affecting a parcel. Any delay between the request and response could be primarily due to the freak event (A3).
5.3 Poor organisation culture
In the third case, Provider A may be an individual within Organisation B. In such cases, the culture of the organisation affects the individual and the effectiveness of the individual affects the organisation. If Provider A leaves the organisation, this past relationship should be taken into account: Provider A may operate differently in a different organisational culture. The provenance should show:
A provenance pattern showing provision of a service within an organisation in a particular instance could be as follows (illustrated in Fig. 5).
Step 1 A client process, A1, sends a request, E1, for a service to A2, for which Provider A is responsible. In the provenance graph, this means that E1 wasGeneratedBy A1, A2 used E1, and A2 wasAssociatedWith Provider A.
Step 2 Provider A is acting on behalf of Organisation B in performing A2. In the provenance graph, this means Provider A actedOnBehalfOf Organisation B in its responsibility for A2 (the latter not depicted in Fig. 5 to retain clarity).
Step 3 A2 completes the service provision, sending the result, E2, back to A1, so that the client has received the service requested. In the provenance graph, this means that E2 wasGeneratedBy A2, and A1 used E2.
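Under the same assumption of PROV-O records queried with SPARQL, a sketch of this organisation pattern is given below. It uses the unqualified prov:actedOnBehalfOf relation, whereas tying the delegation specifically to activity A2 as described in Step 2 would require PROV-O's qualified delegation form. Again, this is illustrative rather than the authors' query.

```python
# Sketch of the poor organisation culture pattern (Steps 1-3) as SPARQL over PROV-O.
# Note: the delegation is left unqualified here; restricting it to activity A2
# would use PROV-O's qualified delegation terms.
ORGANISATION_PATTERN = """
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT DISTINCT ?a2 ?organisation WHERE {
    ?e1 prov:wasGeneratedBy ?a1 .                     # Step 1: client request E1
    ?a2 prov:used ?e1 ;
        prov:wasAssociatedWith ?providerA .
    ?providerA prov:actedOnBehalfOf ?organisation .   # Step 2: delegation
    ?e2 prov:wasGeneratedBy ?a2 .                     # Step 3: response E2
    ?a1 prov:used ?e2 .
}
"""
```

Such a query can be run with the same rdflib call as the sub-provider pattern, binding ?providerA to the provider being assessed.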
We can then distinguish the cases in which the culture of Organisation B may be a mitigating factor in Provider A’s poor provision. Poor performance is identified as described above: either an attribute indicating low quality, a part that is of low quality, or too long a period between the request and response. A variation on the circumstance is to observe where agents were, but are no longer, employed by organisations with a good culture.
We evaluated our approach through simulation, comparing it with FIRE, using an environment based on that used in the original evaluation of FIRE [7]. For transparency, the simulation code is published as open source^4.
6.1 Extending FIRE
Existing reputation methods do not account for mitigating circumstances or the wider context of service provision: the context of an interaction is not considered, and there is no mechanism for recognising mitigating circumstances. In our approach, each agent has its own provenance store, and to determine the reputation of a provider on behalf of a client the assessor queries that client's provenance store and those of its acquaintances. For each interaction recorded in the provenance stores the outcome is considered according to the term(s) that the client is interested in. Since, for illustration, we adopt the FIRE model, the assessor extracts ratings from the provenance of the form (_, b, c, i, v), where b is the provider in interaction i, and the client in i gave b a rating of v for term c. These ratings are then used to determine reputation (using Equations 1 and 3).

Mitigating circumstances and context can be incorporated into existing reputation models by adjusting the weighting that is given to the rating resulting from an interaction for which there are mitigating circumstances. In FIRE, this can be done through the rating weight function, ωK, for each type of reputation, where K ∈ {I, W}, by a factor that accounts for mitigation, specifically:
$$\omega_I(r_i) = \omega_W(r_i) = m \quad (4)$$
where m is the mitigation weight factor. This factor reflects how convincing an agent considers particular mitigating circumstances, and is defined on a per-pattern basis. For the sub-provider and organisation patterns this corresponds to the perceived contribution of a sub-provider or organisation to the service provision, while for a freak event it corresponds to the perceived impact of the event. Mitigation weight factors can be estimated from knowledge of the system and each agent can ascribe a mitigation value to each of its mitigating circumstance patterns. For simplicity, however, we ascribe a global value to each pattern.

Our FIRE implementation calculates trust on the basis of individual and witness experience, i.e. a client's provenance records and those of its acquaintances, applying equal weight to each, but we exclude role-based and certified trust as discussed in Section 4. The original evaluation of FIRE allows exploration of the space of providers, meaning that the most trusted provider is not always chosen. We include an exploration probability, e, where a client selects the most trusted provider with probability 1 − e, otherwise considering the next most trusted (again selected with probability 1 − e), and so on. This differs from the original evaluation of FIRE which uses Boltzmann exploration to reduce exploration over time. The
(^4) http://bit.ly/1uqLAZO
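A minimal sketch of how Equation 4 could plug into the rating weight computation of Section 4.1 follows. The pattern names, the values of m, and the choice to let the mitigation factor replace (rather than combine with) the recency weight when a pattern matches are assumptions for illustration; the "with recency" variant evaluated below may combine them differently.

```python
import math

# Illustrative per-pattern mitigation weight factors (m in Equation 4).
# The actual values are system- and agent-specific; these are placeholders.
MITIGATION_FACTORS = {
    "unreliable_sub_provider": 0.1,
    "freak_event": 0.2,
    "poor_organisation_culture": 0.1,
}

def rating_weight(age, matched_patterns, recency_lambda=None):
    """Weight of a rating: m when a mitigation pattern matches (Equation 4),
    otherwise the recency weight of Equation 2 (or 1.0 if recency is unused)."""
    if matched_patterns:
        return min(MITIGATION_FACTORS[p] for p in matched_patterns)
    if recency_lambda is not None:
        return math.exp(-age / recency_lambda)
    return 1.0
```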
[Fig. 7. Per-round utility over one simulation (rounds 300-350), comparing our mitigating-circumstances approach without recency against FIRE with recency.]
events, poor organisational culture). Our approach has improved performance, both with and without recency, over FIRE, with an improvement of 10.1% without and 9.3% with recency scaling respectively. The recency scaling of FIRE is also shown to be beneficial where mitigating circumstances are not taken into account, i.e. FIRE is better than FIRE without recency. These results match the intuition that recency is valuable for taking account of changes in circumstances, but is crude compared to what is possible when past circumstances are visible. When recency is combined with mitigating circumstances there is negligible improvement, further supporting this intuition.

We also considered how utility varied over a simulation, to better understand the results above. Fig. 7 shows the per-round utility for an extract of a single simulation for FIRE and our approach without recency (other approaches are omitted for clarity). Utility varies significantly over time, as changing circumstances mean the most trusted agents may not be the best providers. Our approach has more and higher peaks than FIRE, leading to the higher cumulative utility described above. We believe that this is because our strategy recovers from a change in circumstance more quickly than FIRE. While FIRE's recency scaling means that irrelevant past circumstances are eventually ignored, our approach immediately takes account of the difference in past and present circumstances.

To understand how individual circumstances contributed to the results, we simulated the system with a single circumstance pattern applied. In the case of freak events (Fig. 8a) our approach performs similarly to FIRE, with a small improvement (1.1% in cumulative utility over 1000 rounds). As expected, FIRE without recency performs worse. Our approach has similar results with and without recency, implying that for a low incidence of freak events (25%), consideration of recency along with mitigating circumstances has little effect. For unreliable sub-providers (Fig. 8b), there is value to scaling by recency in addition to considering mitigating circumstances. Our approach with recency performs similarly to FIRE (with a 1.6% improvement), but without recency scaling the utility is significantly lower. Note that both variants of the sub-provider pattern are used, and both poor and good interactions are scaled.
[Fig. 8. Cumulative utility over 1000 rounds for use of the individual mitigating circumstances patterns: (a) freak event, (b) unreliable sub-provider, (c) poor organisation culture. Each plot compares mitigating with and without recency, FIRE with and without recency, and random selection.]
With poor organisation culture (Fig. 8c) our approach, with and without recency, outperforms FIRE, with the largest improvement without recency (13.2%). Here recency scaling reduces performance, and we believe this is because the pattern identifies appropriate situations, and additional scaling reduces the impact of relevant ratings.
6.3 Discussion
In this section, we attempt to answer questions about the results and approach.

Why does accounting for recency seem to be a disadvantage in some results? Recency accounts for changes between the past and present, allowing obsolete information to be forgotten. Weighting relevance by matching against the current circumstance based on provenance patterns aims to account for the past more precisely. Therefore, where the circumstance patterns work as expected, also accounting for recency will dilute the precision, producing worse results.

Why does the result with just unreliable sub-providers show a disadvantage for our approach? The results in Figure 8b show our strategy without recency being outperformed by our strategy with recency and FIRE. As discussed above, this suggests that the current pattern used for this circumstance does not provide the correct relevance weighting to account for the past precisely, and so recency is a valuable approximation. We have not yet determined why this pattern is imprecise, and it is under investigation.

Why would providers capture provenance graphs? In a practical system, we must account for why provenance graphs would be captured and how they would be accessed by clients. Providers are the obvious source of the provenance data, as it is a record of service provision, but it may be against their interests to release records of poor performance. There are a few answers to this question, though full exploration of the issue is beyond the scope of this paper. First, contractual agreements between clients and providers can require some recording of details as part of providing the service, possibly with involvement of a notary to help ensure validity. In many domains such documentation is a contractual obligation, e.g. journalists must document evidence capture and financial services