The flashcards below were created by user burntoutmatch on FreezingBlue Flashcards.

  1. How do you set up a SPARQL query?
    • SPARQL is the standardized query language for RDF.
    • A query is a set of triple patterns: subject, predicate, object.
    • Only variables start with a question mark.
    • The subject is an instance or a class. The predicate is a data property or an object property. The object can be a literal value, a class, or an instance.
  2. function of a semicolon in a query
    Shortcut: a semicolon (;) can be used to separate two triple patterns that share the same subject.

    • ?country a type:LandlockedCountries ;
    • rdfs:label ?country_name ;
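A rough Python sketch of how two triple patterns that share a subject are matched against a graph. The in-memory triples, the `ex:` names, and the `match` and `query` helpers are all hypothetical and only illustrate the idea; a real SPARQL engine does far more.

```python
# Hypothetical in-memory graph: a list of (subject, predicate, object) triples.
TRIPLES = [
    ("ex:Austria", "rdf:type", "type:LandlockedCountries"),
    ("ex:Austria", "rdfs:label", "Austria"),
    ("ex:Norway", "rdf:type", "type:Country"),
    ("ex:Norway", "rdfs:label", "Norway"),
]


def is_var(term):
    """Only variables start with a question mark."""
    return term.startswith("?")


def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`; None on failure."""
    new = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            if new.setdefault(p_term, t_term) != t_term:
                return None  # variable already bound to a different value
        elif p_term != t_term:
            return None  # constant does not match
    return new


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for b in bindings
            for t in triples
            if (extended := match(pattern, t, b)) is not None
        ]
    return bindings


# "?country a type:LandlockedCountries ; rdfs:label ?country_name" expands to
# two patterns that share the subject ?country ("a" is shorthand for rdf:type).
results = query(
    [
        ("?country", "rdf:type", "type:LandlockedCountries"),
        ("?country", "rdfs:label", "?country_name"),
    ],
    TRIPLES,
)
print(results)
```

Because both patterns use `?country`, only subjects that satisfy both of them end up in the results.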
  3. PREFIX ab: <>
    SELECT ?craigEmail
    WHERE {
    ?person ab:firstName "Craig" .
    ?person ab:email ?craigEmail .
    }
    Returns all email addresses that belong to people with the first name "Craig".

    • --------------------------------
    • | craigEmail |
    • ================================
    • | "" |
    • | "" |
  4. PREFIX ab: <> 
    SELECT ?first ?last
    WHERE {
    ?person ab:homeTel "(229) 276-5135" .
    ?person ab:firstName ?first .
    ?person ab:lastName ?last .
    }

    Returns the first name and last name of people with the given telephone number.

    • | first | last |
    • ======================
    • | "Richard" | "Mutt" |
  5. PREFIX a: <> 
    SELECT ?propertyName ?propertyValue
    WHERE {
    ?person a:firstName "Cindy" .
    ?person a:lastName "Marshall" .
    ?person ?propertyName ?propertyValue .
    }
    • The subject here is the person named Cindy Marshall. The predicates are that person's data properties. The objects are the values of those data properties.
    • | propertyName | propertyValue |
    • =====================================
    • | a:email | "" |
    • | a:homeTel | "(245) 646-5488" |
    • | a:lastName | "Marshall" |
    • | a:firstName | "Cindy" |
  6. ?class rdfs:subClassOf ?theSuperClass
    • Only variables (terms with a question mark) show up as columns in the results; constants such as rdfs:subClassOf do not.
    • class | theSuperClass
    • Cat | Animal
    • Dog | Animal
    • Animal | Thing
  7. Q4. What do we mean by the word “semantics” in Information Systems? Why is “semantics” useful (or even perhaps necessary) for the use of information in the world today?
    Making something more semantic means making it more meaningful. Semantic code refers to how one structures content in code. Anyone can create and share data, so semantics is needed: without it we are unable to automatically relate, disambiguate, understand, and abstract data, and distill them into knowledge we can reliably reason with. We want to use machines to perform quality, reliable, distributed, and scalable analysis and reasoning with data. Semantics provides the glue for integration and analysis of heterogeneous data for comprehensive situational awareness, question answering, and knowledge discovery.
  8. Q5. What is Implicit Semantics? Give an example of how a semantic application can make use of implicit semantics.
    • Co-occurrence of documents or terms in the same cluster after a clustering process based on some similarity measure is completed.
    • A document linked to another document via a hyperlink, potentially associating semantic metadata describing the concepts that relate the two documents. Like a link to the definition of a word.
    • The sort of semantics implied by two documents belonging to categories that are siblings of each other in a concept hierarchy.
    • Automatic classification of a document to broadly indicate what a document is about with respect to a chosen taxonomy. Further, use the implied semantics to disambiguate (does the word “palm” in a document refer to a palm tree, the palm of your hand, or a palm-top computer?).
    • Bioinformatics applications that exploit patterns like sequence alignment, secondary and tertiary protein structure analysis, and so forth.

    Implicit semantics. "This type of semantics refers to the kind that is implicit in data and that is not represented explicitly in any machine processable syntax."
  9. Q6. Suppose you saw the following comment on a social media web site, and you had to automatically infer what the content of the message was: “Lily I loved your cheryl tweedy do ... heart Amy.” What are the challenges to assigning meaning? How would you try and solve the problem?
    The post is referring to artist Lily Allen’s music track ‘Cheryl Tweedy’. The poster Amy also shares a first name with a popular artist, ‘Amy Winehouse’. Assuming that the end goal is to annotate artist and track/album mentions, the task here is to decide whether the entities Lily, Cheryl Tweedy and Amy in the post are of interest. In such cases of ambiguity, a knowledge base along with explicated relationships will provide context in addition to word distributions in a corpus. A domain model such as MusicBrainz, for example, will inform that “Cheryl Tweedy” is a track by artist ‘Lily Allen’. ‘Lily Allen’ and ‘Amy Winehouse’ are different artists from different genres, Pop and Jazz respectively. The lack of additional support for ‘Amy’ from the knowledge base, in spite of the capitalized first letter and the sentence parse assigning a noun tag (see Figure 2), could be taken into consideration before annotating the mention.

    Problem with relying only on sentence grammar parsing: a sentence that parses correctly can still be nonsensical.

    Off-topic Noise: Given the tendency for users to digress in informal settings, removing off-topic noise is an important task toward understanding what the content is about.
  10. Q7. What is the idea of “Linked Data”? Why is linked data useful? What are the four rules of linked data?
    • Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods. It gives context. 5-star data after markup, RDF, etc.
    • ✦ Use URIs as names for things.
    • ✦ Use HTTP URIs so that people can look up those names.
    • ✦ When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
    • ✦ Include links to other URIs, so that they can discover more things.
    • It is the unexpected re-use of information which is the value added by the web.
  11. what is big data?
    Extremely large data sets that may be analysed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions. Data sets that are so large or complex that traditional data processing applications are inadequate to deal with them. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying, updating and information privacy.
  12. what does SELECT * WHERE { ?first ?second wine:RedWine} return?
    • first - the wine name
    • second - rdf:type
  13. what does SELECT * WHERE { wine:RedWine ?second ?third} return?
    • second - the predicate linking wine:RedWine to ?third: rdfs:comment, rdf:type, rdfs:subClassOf, rdf:type, rdf:type
    • third - "the comment annotation for RedWine", owl:Class, Wine, owl:Class, owl:Class
  14. What does SELECT * WHERE { ?first ?second wine:Wine} return?
    • first - all the subclasses of the class Wine, including a blank node
    • second - the predicate linking ?first to wine:Wine: rdfs:subClassOf for each subclass, except for the blank node, where it is rdf:first
  15. rdf triple
    • s, p, o.
    • The subject is, well, the subject. It identifies what object the triple is describing. The predicate defines the piece of data in the object we are giving a value to. The object is the actual value.
  16. blank node
    Also called a bnode; its identifier is generated by the RDF Validator. Its purpose is to make sure we can identify the subject where it hasn’t been specifically named.
  17. optional age example
    • SELECT ?name ?age
    • WHERE
    • {
    • ?person vcard:FN ?name .
    • OPTIONAL { ?person info:age ?age }
    • }
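A minimal Python sketch of what OPTIONAL does in the query above: every person with a name yields a row, and the age is filled in only when it exists. The `ex:` people and their values are made-up data, and `None` stands in for an unbound variable.

```python
# Made-up data: every person has a name, but not everyone has an age.
NAMES = {"ex:alice": "Alice", "ex:bob": "Bob"}
AGES = {"ex:alice": 30}

rows = []
for person, name in NAMES.items():
    # Required pattern: ?person vcard:FN ?name
    # OPTIONAL pattern: ?person info:age ?age (None stands for "unbound")
    rows.append((name, AGES.get(person)))

print(rows)
```

Without OPTIONAL, the row for Bob would be dropped because the age pattern has nothing to match.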
  18. microformats
    • Microformats is a community-driven effort to extend semantics to well-specified, precise domains.
  19. microdata example movie
    • <div itemscope itemtype ="">
    • <h1 itemprop="name">Avatar</h1>
    • <span>Director: <span itemprop="director">James Cameron</span> (born August 16,
    • 1954)</span>
    • <span itemprop="genre">Science fiction</span>
    • <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
    • </div>
  20. RDFa light attributes
    vocab, typeof, property, resource, prefix
  21. RDFa light example movie
    • <div vocab="" typeof="Movie">
    • <h1 property="name">Avatar</h1>
    • <span>Director: <span property="director">James Cameron</span>
    • (born August 16, 1954)</span>
    • <span property="genre">Science fiction</span>
    • <a href="../movies/avatar-theatrical-trailer.html" property="trailer">Trailer</a>
    • </div>
  22. SSocW
    Semantic Social Web, Web 3.0
  23. LIWC
    • Linguistic Inquiry and Word Count.
    • First person words (I, me, my). Social Words. Positive / Negative Emotions. Cognitive Processes. Analytic refers to analytical or formal thinking.  Clout taps writing that is authoritative, confident, and exhibits leadership.  Authenticity refers to writing that is personal and honest.  Emotional tone is scored such that higher numbers are more positive and upbeat and lower numbers are more negative.
  24. Word Example Cried
    Part of 4 word categories (sadness, negative emotion, overall affect, and past tense verb). This gives us clues as to the meaning behind the word.
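A small Python sketch of dictionary-based word counting in the style described above. Only the categories for "cried" come from the card; the other dictionary entries and the `count_categories` helper are assumptions for illustration, not the real LIWC dictionary.

```python
import re
from collections import Counter

# Toy category dictionary. Only the entry for "cried" comes from the card;
# the other entries are made-up assumptions, not the real LIWC dictionary.
CATEGORIES = {
    "cried": ["sadness", "negative emotion", "overall affect", "past tense verb"],
    "i": ["first person"],
    "my": ["first person"],
    "happy": ["positive emotion", "overall affect"],
}


def count_categories(text):
    """Count how many words of `text` fall into each category."""
    counts = Counter()
    for word in re.findall(r"[a-z']+", text.lower()):
        counts.update(CATEGORIES.get(word, []))
    return counts


counts = count_categories("I cried when my team lost")
print(counts)
```

One word can add to several categories at once, which is how a single word like "cried" gives several clues about meaning.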
  25. analysis examples
    LIWC, musicbrainz, urbandictionary
  26. social web data perspectives
    • what is being said? theme
    • where is it being said? spatial
    • when is it being said? temporal
  27. uri
    Uniform Resource Identifier. The URI syntax defines a grammar that is a superset of all valid URIs, allowing an implementation to parse the common components of a URI reference without knowing the scheme-specific requirements of every possible identifier.
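As an illustration of parsing the common components without scheme-specific knowledge, Python's standard `urllib.parse.urlsplit` can be used. The URI below is a made-up example.

```python
from urllib.parse import urlsplit

# Hypothetical URI used only for illustration.
uri = "http://example.org/people/alice?format=rdf#me"

parts = urlsplit(uri)
print(parts.scheme)    # the scheme component
print(parts.netloc)    # the authority component
print(parts.path)
print(parts.query)
print(parts.fragment)
```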
  28. linked data foaf opera example
    This means that I can write in my FOAF file that I know Håkon Lie by using his URI in the Opera Community data, and a person or machine browsing that data can then follow that link and find all his friends. [Update:] Also, the Opera Community site allows you to register the RDF URI for yourself on another site. This means that public data about you from different sites can be linked together into one web, and a person or machine starting with your Opera identity can find the others.
  29. Provenance
    Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. The provenance of information is crucial to making determinations about whether information is trusted, how to integrate diverse information sources, and how to give credit to originators when reusing information. Where did the data come from and how did it get there? Important for semantics that data is machine interpretable/readable.
  30. provenance and business
    Business applications may exploit provenance in trusting a product as they consider the manufacturing processes involved.  The provenance of a cultural artefact in terms of its origins and prior ownerships is crucial to determine its authenticity.
  31. business provenance example buying a product / planet
    Every day we buy products that impact our planet. Opaque supply chains are devastating environments and compromising the wellbeing of people, animals and communities. Every product and business is different, but rarely do we have the information we need to make positive choices about what to buy. Provenance is a platform that empowers brands to take steps toward greater transparency by tracing the origins and histories of products. With our technology, you can easily gather and verify stories, keep them connected to physical things and embed them anywhere online. We aren’t claiming to be ‘ethical’ or ‘sustainable’, but we are striving for both these ideals through a firm commitment to open, accessible information.
  32. why is provenance important for business?
    trust (amazon ratings), collaboration (between producers, suppliers, retailers, end customers), integration, authenticity (product key, end fake claims, tackle counterfeits), security, loyalty (meaningful relationships with customers). traceability. a secure open registry.
  33. 3 classes prov-o
    • A prov:Entity is a physical, digital, conceptual, or other kind of thing with some fixed aspects; entities may be real or imaginary.
    • A prov:Activity is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities.
    • A prov:Agent is something that bears some form of responsibility for an activity taking place, for the existence of an entity, or for another agent's activity.
  34. prov-o chart
    properties prov:used, prov:wasGeneratedBy, prov:wasInformedBy, and prov:wasDerivedFrom
  35. parsing
    Parsing is the process of analyzing text made of a sequence of tokens to determine its grammatical structure with respect to a given (more or less) formal grammar. The parser then builds a data structure based on the tokens. This data structure can then be used by a compiler, interpreter or translator to create an executable program or library.
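A minimal Python sketch of these two steps, tokenizing and then building a data structure. The toy arithmetic grammar and the `tokenize` and `parse` helpers are assumptions chosen for brevity; they are not taken from the course material.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[+*()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(tokens):
    """Build a nested-tuple tree for the toy grammar:
    expr := term ('+' term)* ; term := factor ('*' factor)* ;
    factor := NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def factor():
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if tok == "(":
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse(tokenize("1 + 2 * (3 + 4)"))
print(tree)
```

The resulting tree encodes the grammatical structure (multiplication binds tighter than addition), which is what a later stage such as an interpreter would work from.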
  36. formal semantics
    Formal semantics. "Semantics that are represented in some well-formed syntactic form (governed by syntax rules) is referred to as formal semantics which is precise and unambiguous."
  37. clustering
    • Clustering is defined as the process of grouping similar entities or objects together in groups based on some notion of similarity. Clustering is considered a form of unsupervised learning. The applications of clustering use a given similarity metric and, as a result of the grouping of data points into clusters, attempt to use this information (implicit semantics) to learn something about the interactions between the clustered entities.
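A rough Python sketch of clustering with a similarity metric. It assumes Jaccard similarity over word sets and a simple greedy single-pass grouping; the documents, the threshold, and the `jaccard` and `cluster` helpers are all made up for illustration and are only one of many possible clustering approaches.

```python
def jaccard(a, b):
    """Similarity of two word sets: size of intersection over size of union."""
    return len(a & b) / len(a | b)


def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering: put each document into the first cluster
    whose first member is similar enough, otherwise start a new cluster."""
    clusters = []
    for name, text in docs.items():
        words = set(text.lower().split())
        for members in clusters:
            if jaccard(words, members[0][1]) >= threshold:
                members.append((name, words))
                break
        else:
            clusters.append([(name, words)])
    return [[name for name, _ in members] for members in clusters]


# Made-up documents; no labels are supplied, so the grouping is unsupervised.
DOCS = {
    "d1": "palm tree on the beach",
    "d2": "coconut palm tree",
    "d3": "palm pilot handheld computer",
    "d4": "handheld computer review",
}

clusters = cluster(DOCS)
print(clusters)
```

Which cluster a document lands in is implicit semantics: the word "palm" in d3 is grouped with computer documents rather than tree documents, although nothing states that explicitly.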
  38. powerful soft semantics
    • Attempt to overcome the shortcomings of the rigid set-based interpretations associated with currently prevalent representation mechanisms by allowing for representation of degree of membership and degree of certainty.
  39. problem with semantics
    The problem with semantics is that it comes down to the human understanding of language in all its vagueness, ambiguity, and context sensitivity. In many cases, efforts to standardize meanings strip out vagueness and ambiguity from data when the goal is to foster precise, unambiguous, and consistent communication between systems. The problem with this approach, however, is that human communication is inherently vague, ambiguous, and relative.
  40. ontology
    According to Wikipedia, “an ontology formally represents knowledge as a hierarchy of concepts within a domain, using a shared vocabulary to denote the types, properties and interrelationships of those concepts. Ontologies are the structural frameworks for organizing information.”
  41. Financial Industry Business Ontology (FIBO)
    A financial services industry effort to define terms, definitions, and synonyms. FIBO will contribute to regulation-driven transparency in the global financial system. “FIBO Foundations is the first of a set of important semantic building blocks that can be reused by the financial industry to construct powerful operational capabilities that can validate, link, classify, visualize and reason over data relationships in ways that will bring tremendous value to the business.”
  42. Q7. There are three main ways to include in a web document: microdata, RDFa,
    and JSON-LD. Briefly describe each of these techniques, and list some advantages/
    disadvantages of each.
    Microdata: No real plans to update it. Only used in (X)HTML5/HTML. Difficult to use several vocabularies. Enables the creation of "rich snippets," which display more information on the search result pages than traditional listings.

    RDFa: published as W3C Recommendation, can be used in various host languages, i.e. several (X)HTML variants and XML, use of prefixes, RDFa allows to mix vocabularies, RDFa is an RDF serialization, and RDF is the foundation of W3C’s Semantic Web.

    JSON-LD: a relatively new, JSON-based format. The main benefit is that you split the representation layer (HTML) from the semantic layer (JSON-LD). You can make changes in the JSON-LD, like adding new fields, parameters, etc., without touching the HTML.
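A hedged Python sketch of the JSON-LD approach, reusing the Avatar movie from the microdata and RDFa cards. It assumes schema.org as the vocabulary (the vocabulary URL is not shown in these notes) and simply builds the JSON-LD with the standard `json` module.

```python
import json

# Assumption: schema.org is the vocabulary; the property names mirror the
# Avatar movie markup from the microdata and RDFa cards.
movie = {
    "@context": "https://schema.org",
    "@type": "Movie",
    "name": "Avatar",
    "director": "James Cameron",
    "genre": "Science fiction",
    "trailer": "../movies/avatar-theatrical-trailer.html",
}

# The JSON-LD lives in its own <script> element, separate from the visible HTML.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(movie, indent=2)
    + "\n</script>"
)
print(script_tag)
```

Adding a new field only means adding a key to the dictionary; the HTML that users see stays untouched.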
  43. For each of the HTML snippets in parts (a) and (b), write out the RDF triples embedded
    into the snippets. You should write the RDF with the turtle notation.

    <p vocab="" prefix="ov:" resource="#manu" typeof="Person">
    My name is
    <span property="name">Manu Sporny</span>
    and you can give me a ring via
    <span property="telephone">1-800-555-0199</span>.
    <img property="image" src="" />
    My favorite animal is the <span property="ov:preferredAnimal">Liger</span>.
    • @prefix ns1: <> .
    • @prefix ns2: <> .
    • @prefix ov: <> .
    • <> ns1:usesVocabulary ns2: .
    • <#manu> a ns2:Person;
    • ov:preferredAnimal "Liger";
    • ns2:image <>;
    • ns2:name "Manu Sporny";
    • ns2:telephone "1-800-555-0199" .
  44. For each of the HTML snippets in parts (a) and (b), write out the RDF triples embedded
    into the snippets. You should write the RDF with the turtle notation.

    <div vocab="" resource="#bbg" typeof="LocalBusiness">
    <h1 property="name">Beachwalk Beachwear &amp; Giftware</h1>
    <span property="description"> A superb collection of fine gifts and clothing
    to accent your stay in Mexico Beach.</span>
    <div property="address" resource="#bbg-address" typeof="PostalAddress">
    <span property="streetAddress">3102 Highway 98</span>
    <span property="addressLocality">Mexico Beach</span>,
    <span property="addressRegion">FL</span>
    Phone: <span property="telephone">850-648-4200</span>
    • @prefix ns1: <> .
    • @prefix ns2: <> .
    • <> ns2:usesVocabulary ns1: .
    • <#bbg> a ns1:LocalBusiness;
    • ns1:address <#bbg-address>;
    • ns1:description """ A superb collection of fine gifts and clothing
    • to accent your stay in Mexico Beach.""";
    • ns1:name "Beachwalk Beachwear & Giftware";
    • ns1:telephone "850-648-4200" .
    • <#bbg-address> a ns1:PostalAddress;
    • ns1:addressLocality "Mexico Beach";
    • ns1:addressRegion "FL";
    • ns1:streetAddress "3102 Highway 98" .
  45. PREFIX ab: <>
    SELECT *
    WHERE {
    ?s ?p ?o .
    FILTER (regex(?o, "yahoo","i"))
    }
    • | s | p | o |
    • | <> | ab:email | "" |
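A small Python sketch of what the regex FILTER does: it keeps only the rows whose object contains "yahoo", and the "i" flag makes the match case-insensitive. The triples, identifiers, and addresses below are made-up placeholders, not the data behind the card.

```python
import re

# Made-up triples; the identifiers and addresses are placeholders.
TRIPLES = [
    ("ab:i0432", "ab:firstName", "Richard"),
    ("ab:i0432", "ab:email", "richard@Yahoo.example"),
    ("ab:i8301", "ab:email", "craig@hotmail.example"),
]

# regex(?o, "yahoo", "i"): keep rows whose object contains "yahoo";
# the "i" flag makes the match case-insensitive.
pattern = re.compile("yahoo", re.IGNORECASE)
matches = [(s, p, o) for (s, p, o) in TRIPLES if pattern.search(o)]
print(matches)
```

Because the pattern `?s ?p ?o` matches every triple, the FILTER is the only thing that narrows the result.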
  46. @prefix foaf: <> .

    _:a foaf:name "Johnny Lee Outlaw" .
    _:a foaf:mbox <> .
    _:b foaf:name "Peter Goodguy" .
    _:b foaf:mbox <> .
    _:c foaf:mbox <> .

    PREFIX foaf: <>
    SELECT ?name ?mbox
    { ?x foaf:name ?name .
    ?x foaf:mbox ?mbox }
    • name | mbox
    • "Johnny Lee Outlaw" | <>
    • "Peter Goodguy" | <>
  47. @prefix foaf: <> .

    _:a foaf:name "Alice" .
    _:b foaf:name "Bob" .

    PREFIX foaf: <>
    SELECT ?x ?name
    WHERE { ?x foaf:name ?name }
    • x | name
    • _:c | "Alice"
    • _:d | "Bob"
  48. @prefix dc: <> .
    @prefix : <> .
    @prefix ns: <> .

    :book1 dc:title "SPARQL Tutorial" .
    :book1 ns:price 42 .
    :book2 dc:title "The Semantic Web" .
    :book2 ns:price 23 .

    PREFIX dc: <>
    PREFIX ns: <>
    SELECT ?title ?price
    WHERE { ?x dc:title ?title .
    OPTIONAL { ?x ns:price ?price . FILTER (?price < 30) }
    }
    • title | price
    • "SPARQL Tutorial" |
    • "The Semantic Web" | 23
Card Set:
2016-12-06 22:54:11

info116 exam notes