[cti-users] MTI Binding


[cti-users] MTI Binding

Jordan, Bret
From the comments so far on the GitHub wiki [1], the current consensus from the community is for JSON to be used as the MTI (mandatory-to-implement) binding for STIX. If you agree, disagree, or have a different opinion, please update at least the final Conclusions section with your view.

[1] https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis


Thanks,

Bret



Bret Jordan CISSP
Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems
PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 



[cti-users] RE: MTI Binding

Cory Casanave

What about RDF in JSON? That would then have a well-defined schema.

 


 


[cti-users] Re: MTI Binding

Jordan, Bret
JSON does have a schema language, and we have used it to build schema validation for JSON-based TAXII, which is in the wild today.
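As a sketch of that kind of JSON schema validation (the field names below are hypothetical, not the actual TAXII schema), a document can be checked against a schema-style description of required fields and types using only the standard library:

```python
import json

# Hypothetical, simplified message -- NOT the real TAXII schema.
document = json.loads("""
{
  "message_type": "discovery_request",
  "message_id": "1",
  "extended_headers": {}
}
""")

# A minimal schema-style description: required field -> expected JSON type.
required = {
    "message_type": str,
    "message_id": str,
    "extended_headers": dict,
}

def validate(doc, schema):
    """Return a list of validation errors (an empty list means valid)."""
    errors = []
    for field, expected_type in schema.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            errors.append(f"wrong type for {field}")
    return errors

print(validate(document, required))  # [] -- the document passes
```

Real deployments would use a full JSON Schema validator rather than this hand-rolled check, but the idea is the same: the schema is data, and validation is mechanical.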


I am not against looking at RDF in JSON, but I would like to understand why it would be valuable over native JSON. That is, what does it give us, and what is the complexity cost of using it for open-source, web-application, and app developers? I want things as simple as possible to help us gain a higher percentage of adoption.

Thanks,

Bret





Re: [cti-users] MTI Binding

Wunder, John A.
In reply to this post by Cory Casanave
Can you elaborate a little, Cory? What are the advantages of RDF in JSON vs. either native JSON, native XML, or RDF in XML? What are the disadvantages?

If you could fill it out on the wiki that would be awesome, but if not then e-mail is fine too.

John

https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis


Re: [cti-users] MTI Binding

John Anderson

Thanks for the opportunity to weigh in. I apologize in advance if my comment comes across as contrarian. Blame it on my newness to this group. 😊


Side note: The GitHub wiki is not optimal for threaded discussions. On a call yesterday, I thought someone mentioned Atlassian Confluence. Is that a possibility, or is there another OASIS-sponsored forum where we could carry on discussions?


JSA





RE: [cti-users] MTI Binding

Cory Casanave
In reply to this post by Wunder, John A.

John,

With respect to RDF in JSON, logical data models, and other options, I will respond here but will also look at updating the wiki. Sorry in advance for the long message, but I think it is an important point.

JSON comes from an environment of “server applications” supplying data to their “client applications”, where the client applications tended to be tightly coupled and implemented in JavaScript. Its use has, of course, broadened, but that is the foundation and what it is very good at. What makes it “easy” is:

·         There is a well-defined relationship between the client and server applications, usually under the control of the same entity.

·         The server application is primarily in control of what the user will see through the client and how they interact.

·         There is a “dominant decomposition” of the data, because it serves a specific, restricted set of use cases that the data structure and applications are tuned for. A strict data hierarchy works just fine. (Look up “dominant decomposition”; there is a lot of good information on the topic.)

·         Data comes from a single source and can be “bundled” for the next step in the application’s workflow. There is not much need to reference data from other sources or across interactions.

·         The semantics and restrictions of the data are understood within the small team(s) implementing this “client–server” relationship; fancy schema or semantic representations are not needed.

·         The data source is the complete authority, at least for the client application.

·         Things don’t change much, and when they do, it is under a controlled revision on “both ends”.

·         The application technology is tuned to the data structure: JavaScript directly reflects JSON.

A good example may be the “weather channel” application on your phone and web browser. It is all managed by the weather channel developers (and perhaps their partners) for users (a specific stakeholder group) to get weather information (specific information for a purpose) for a region (the dominant decomposition). I don’t know whether they use JSON, but it would be a natural choice. This set of clients is served by servers designed for the above purpose.

RDF and the “semantic web” stack have been designed with a very different set of assumptions:

·         Data providers and data consumers are independent and from different organizations, countries and communities.

·         Data providers and data consumers are independently managed.

·         Data providers have no idea what data consumers will use the data for; the consumer is more in control of what they consume and how they use it.

·         There are numerous use cases, purposes, and viewpoints being served; there is no dominant decomposition.

·         Data may come from multiple sources, and the consumer may follow links to get more information, perhaps from the same or different sources. No static, fixed “bundles” are practical.

·         Due to the distributed community, the data semantics, relations, and restrictions must be clearly communicated in a machine-readable form.

·         Things change all the time and at different rates.

·         No data source is complete; clients may use multiple sources.

·         Any number of technology stacks will be used for both data providers and consumers.

An example could be the position and path of all airliners, worldwide.

 

This difference in design intent results in some specific differences in the technology:

·         RDF (and similar structures) are “data graphs”: information points to information, without a dominant decomposition.

·         JSON is a strict hierarchy, essentially nested name/value pairs.

·         RDF has, at its core, a type system with ways to describe those types.

·         JSON has almost no type system (only strings, numbers, booleans, and null); there is an assumption that “everyone knows what the tags mean”.

·         RDF depends on URIs to reference data; this works within a “document” and across the web. This is where the “linked data” term comes from (note: linked data may or may not be “open”).

·         JSON has no reference system at all; you can invent ways to encode references (local or remote) in strings, but they are ad hoc and tend to be untyped.

·         RDF is a data model with multiple syntax representations (XML, JSON, Turtle, etc.).

·         JSON is a data syntax.
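The contrast above can be sketched concretely (the vocabulary URI, identifiers, and field names below are invented for illustration; they are not any real CTI vocabulary): a plain JSON object carries bare keys and ad-hoc string references, while a JSON-LD rendering of the same data adds machine-readable types and URI-based identity, yet remains ordinary JSON to any parser.

```python
import json

# Plain JSON: keys and cross-references are ad-hoc strings.
plain = {
    "id": "indicator-1234",
    "type": "indicator",
    "related": "campaign-5678"
}

# The same data as JSON-LD (hypothetical vocabulary URI):
# @context maps terms to URIs, @id gives web-wide identity,
# and the reference becomes a resolvable, typed link.
linked = {
    "@context": "https://example.org/cti-vocabulary",
    "@id": "https://example.org/indicators/1234",
    "@type": "Indicator",
    "related": {"@id": "https://example.org/campaigns/5678"}
}

# Both round-trip through the same generic JSON tooling.
for doc in (plain, linked):
    assert json.loads(json.dumps(doc)) == doc

print(linked["related"]["@id"])  # a URI, not just a local string
```

The point is not that the linked form is free; a consumer still has to decide whether and how to dereference those URIs, which is exactly the cost discussed below.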

Here is the rub: programming any application for a more general, more distributed, less “dominant”, less managed, and less coupled environment is going to be harder than coding for the coupled, dominantly managed, technology-tuned one. Changing the syntax is not going to change that. Encoding the RDF model in JSON does allow a simpler syntax (than RDF/XML or, I think, current STIX) and does allow it to be consumed more easily in many clients, but the developer will still have to cope with references, distribution, and “creating their viewpoint” in the application rather than having it handed to them. The flexibility has this cost, and the community has to decide if and how to handle it.

As I have suggested earlier, the best case is to make sure the description of your information (as understood by stakeholders) is represented in precise, high-level, machine-readable models that will work with different decompositions and different syntaxes. If this is not the “single source of truth” for what your data means, you will be stuck in a technology, even if it is RDF.

If there is going to be one “required” syntax, it had best be one that can reflect this general model well, serve diverse communities, support different technology stacks, and be friendly to differing decompositions (no dominant decomposition). Of course, it then has to be as easy to understand and implement as possible under these constraints.

Where such general structures are encoded in XML, they become complex. This is a combination of the need for generality and the limits of XML Schema. But don’t blame XML for complexity that is inherent in the generality of CTI. The same complaint is levied against other general XML formats, like NIEM.

RDF in JSON syntax provides the type system and reference system, and allows for structured composition without requiring it; it is friendlier to this general structure than XML Schema. This seems like a good option. It would be a very good option if generated from a high-level model that would serve to bind all the technologies.

 

Regards,

Cory Casanave

 

 


RE: [cti-users] MTI Binding

mdavidson

Cory,

 

I’m a little unfamiliar with RDF, so I have a clarifying question. In terms of RDF in JSON, is that something that you see security products using directly to interoperate? E.g., my SIEM uses TAXII + STIX/RDF/JSON to talk to my Sensor?

 

Thank you.

-Mark

 


RE: [cti-users] MTI Binding

Cory Casanave

Mark,

Do I see it today? No. There may be some, but I don’t know of it.

Could it be used? Sure. If you have very atomic data, like sensor data, RDF can be VERY compact and understandable.

Since I NEVER program to the data syntax (libraries and MDA magic do that), I really don’t care whether the data is in JSON or XML, but some do, and I could see a sensor hard-coded like that. So the reason I am suggesting looking at the JSON/RDF (JSON-LD) format is that it reads better (and is easier to parse) than the same thing encoded in XML, while supporting the requirements I mentioned.

I should have referenced the “standard” name: JSON-LD.

Other note: I have no vested interest in RDF technologies; it’s something I use where it is the best choice.

Here is some info on Wikipedia: https://en.wikipedia.org/wiki/JSON-LD

Other note: I’m not entirely convinced a single “MTI” is a good idea, but if there is one, a distributed graph structure is the only thing that would scale from a sensor report to a query across millions of data points.
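Purely as an illustration of the “compact sensor data” point (the vocabulary URIs, identifier, and values below are invented, not any real CTI vocabulary): with a context mapping short terms to full URIs, an individual JSON-LD sensor report stays nearly as terse as plain JSON while still carrying graph identity.

```python
import json

# Hypothetical vocabulary: the @context maps short terms to full URIs,
# so the report body itself can use compact keys.
report = {
    "@context": {
        "src": "https://example.org/vocab#source_ip",
        "dst": "https://example.org/vocab#destination_ip",
        "seen": "https://example.org/vocab#observed_at"
    },
    "@id": "https://example.org/observations/42",
    "src": "203.0.113.7",
    "dst": "198.51.100.9",
    "seen": "2015-10-01T13:24:00Z"
}

# Still plain JSON to any consumer; the graph identity (@id) and the
# term-to-URI mapping come along without changing the parser.
serialized = json.dumps(report)
parsed = json.loads(serialized)
print(parsed["@id"])
```

In practice the context would be published once at a stable URL and referenced from each report, so the per-report overhead shrinks to a single line.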

 

 

From: Davidson II, Mark S [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 1:24 PM
To: Cory Casanave; Wunder, John A.
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Cory,

 

I’m a little unfamiliar with RDF, so I have a clarifying question. In terms of RDF in JSON, is that something that you see security products using directly to interoperate? E.g., my SIEM uses TAXII + STIX/RDF/JSON to talk to my Sensor?

 

Thank you.

-Mark

 

From: [hidden email] [[hidden email]] On Behalf Of Cory Casanave
Sent: Thursday, October 01, 2015 11:09 AM
To: Wunder, John A. <[hidden email]>
Cc: Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

John,

With respect to RDF in JSON, logical data models and other options, I will respond here but also look at updating the wiki. Sorry in advance for the long message – but I think it an important point.

 

JSON has come from an environment of “server applications” supplying data to their “client applications”, where the client applications tended to be coupled and implemented in Javascript. The use has, of course, broadened, but that is the foundation and what it is very good at. What makes it “easy” is:

·         There is a well defined relationship between the client and server applications, usually under control of the same entity.

·         The server application is primarily in control of what the user will see through the client and how they interact.

·         There is a “dominate decomposition” of the data because it is serving a specific restricted set of use cases that the data structure and applications are tuned for. A strict data hierarchy works just fine. (Look up “dominate decomposition” – there is a lot of good information on the topic)

·         Data is coming from a single source and can be “bundled” for the next step in the applications workflow. Not much need to reference data from other sources or across interactions.

·         The semantics and restrictions of the data are understood within the small team(s) implementing this “client server” relationship – fancy schema or semantic representations are not needed.

·         The data source is the complete authority, at least for the client application.

·         Things don’t change much and when they do it is under a controlled revision on “both ends”.

·         The application technology is tuned to the data structure – Javascript directly reflects JSON.

A good example may be the “weather channel” application on your phone and web browser. It is all managed by the weather channel developers (and perhaps their partners) for users (specific stakeholder group) to get weather information (specific information for a purpose) for a region (the dominate decomposition). I don’t know if they use JSON, but it would be a natural choice. This set of clients is served by servers designed for the above purpose.

 

RDF & the “semantic web” stack has been designed with a very different set of assumptions:

·         Data providers and data consumers are independent and from different organizations, countries and communities.

·         Data providers and data consumers are independently managed.

·         Data providers have no idea what data consumers will use the data for, the consumer is more in control of what they consume and how they use it

·         There are numerous use cases, purposes and viewpoints being served – there is no dominate decomposition.

·         Data may come from multiple sources and the consumer may follow links to get more information, perhaps from the same or different sources. No static fixed “bundles” are practical.

·         Due to the distributed community the data semantics, relations and restrictions must be clearly communicated in a machine readable form.

·         Things change all the time and at different rates

·         No data source is complete, clients may use multiple sources

·         Any number of technology stacks will be used for both data providers and consumers.

An example could be the position and path of all airliners, worldwide.

 

This difference in design intent results in some specific differences in the technology:

·         RDF (and similar structures) are “data graphs” – information points to information without a dominate decomposition.

·         JSON is a strict hierarchy, essentially nested name/value pairs

·         RDF has as its core a type system with ways to describe those types

·         JSON has no type system, everything is a string. There is an assumption that “everyone knows what the tags mean”

·         RDF depends on URIs to reference data – this works within a “document” and across the web. This is where the “Linked data” term comes from (note: linked data may or may not be “open”)

·         JSON has no reference system at all, you can invent ways to encode references (locally or remote) in strings but they are ad-hoc and tend to be untyped

·         RDF is a data model with multiple syntax representations (XML, JSON, Turtle, etc)

·         JSON is a data syntax

Here is the rub: Programming any application for a more general, more distributed, less “dominate”, less managed and less coupled environment is going to be harder than coding for the coupled, dominate managed and technology tuned one. Changing the syntax is not going to change that. Encoding the RDF model in JSON does allow a simpler syntax (than RDF-XML or, I think, current STIX) and does allow it to be consumed more easily in many clients, but the developer will still have to cope with references, distribution and “creating their viewpoint” in the application rather than having it handed to them. The flexibility has this cost and the community has to decide if and how to handle it.

 

As I have suggested earlier, the best case is to make sure the description of your information (as understood by stakeholders) is represented in precise high-level machine readable models that will work with different decompositions and different syntaxes. It this is not the “singe source of the truth” for what your data means, you will be stuck in a technology – even if it is RDF.

 

If there is going to be one “required” syntax it best be one that can reflect this general model well, serve diverse communities, support different technology stacks and is friendly to differing decompositions (no dominate decomposition). Of course, it then has to be as easy to understand and implement as is possible under these constraints. 

 

Where such general structures are encoded in XML it becomes complex. This is a combination of the need for the generality and the limits of XML schema. But, don’t blame XML for complexity that is inherent in the generality of CTI. The same complaint is levied on other general XML formats, like NIEM.

 

RDF in JSON syntax provides the type system and reference system, and allows for structured composition without requiring it – it is friendlier to this general structure than XML Schema. This seems like a good option. It would be a very good option if generated from a high-level model that would serve to bind all the technologies.

 

Regards,

Cory Casanave

 

 

From: Wunder, John A. [[hidden email]]
Sent: Thursday, October 01, 2015 9:18 AM
To: Cory Casanave
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

Can you elaborate a little, Cory? What are the advantages of RDF in JSON vs. either native JSON, native XML, or RDF in XML? What are the disadvantages?

 

If you could fill it out on the wiki that would be awesome, but if not then e-mail is fine too.

 

John

 

https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis

 

On Sep 30, 2015, at 8:20 PM, Cory Casanave <[hidden email]> wrote:

 

What about RDF in JSON? This then has a well defined schema.

 

From: [hidden email] [[hidden email]] On Behalf Of Jordan, Bret
Sent: Wednesday, September 30, 2015 6:56 PM
To: [hidden email]; [hidden email]
Subject: [cti-users] MTI Binding

 

From the comments so far on the github wiki [1], the consensus right now from the community is for JSON to be used as the MTI (mandatory to implement) binding for STIX. For those that agree or disagree or have a different opinion, please update at least the final Conclusions section with your opinion.  

 

[1] https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis

 

Thanks,

 

Bret

 

 

 

Bret Jordan CISSP

Director of Security Architecture and Standards | Office of the CTO

Blue Coat Systems

PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050

"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 

 

 

Reply | Threaded
Open this post in threaded view
|

Re: [cti-users] MTI Binding

Shawn Riley
Just wanted to share a couple of links that might be of interest here for RDF translation.

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.

JSON-LD parser and serializer plugins for RDFLib (Python 2.5+)

Here is an online example of an RDF to multi-format translator.


On Thu, Oct 1, 2015 at 1:39 PM, Cory Casanave <[hidden email]> wrote:

Mark,

Do I see it today? No. There may be some, but I don’t know of it.

Could it be used – sure. If you have very atomic data, like sensor data, RDF can be VERY compact and understandable.

 

Since I NEVER program to the data syntax (libraries and MDA magic do that) I really don’t care if the data is in JSON or XML, but some do, and I could see a sensor hard-coded like that. So the reason I am suggesting looking at the JSON/RDF (JSON-LD) format is that it reads better (and parses more easily) than the same thing encoded in XML, while supporting the requirements I mentioned.
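As a rough illustration of that readability point (the vocabulary URIs here are invented), here is the same single statement in RDF/XML and in JSON-LD; the JSON form can be consumed with nothing but a standard-library parser:

```python
import json

# One statement ("indicator 1234 has the name 'Suspicious domain'"),
# first in RDF/XML, then in JSON-LD.
rdf_xml = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"\n'
    '         xmlns:ex="http://example.org/">\n'
    '  <rdf:Description rdf:about="http://example.org/indicator/1234">\n'
    '    <ex:name>Suspicious domain</ex:name>\n'
    '  </rdf:Description>\n'
    '</rdf:RDF>'
)

json_ld = """{
  "@context": {"name": "http://example.org/name"},
  "@id": "http://example.org/indicator/1234",
  "name": "Suspicious domain"
}"""

# Plain dict access; no namespace machinery required.
doc = json.loads(json_ld)
print(doc["name"])
```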

 

I should have referenced the “standard” name: JSON-LD

 

Other note: I have no vested interest in RDF technologies; it’s something I use where it is the best choice.

 

Here is some info on Wikipedia: https://en.wikipedia.org/wiki/JSON-LD

 

Other note: I’m not entirely convinced a single “MTI” is a good idea, but if there is to be one, a distributed graph structure is the only thing that would scale from a sensor report to a query across millions of data points.
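A minimal sketch of why that graph structure scales (all identifiers invented): because every node carries a global ID, reports produced independently by different sensors can be merged into one index without any coordination between their producers:

```python
# Two reports produced independently by different sensors. Because every
# node carries a global "@id", merging them is just indexing by that URI.
report_a = [
    {"@id": "http://example.org/indicator/1", "seen": "2015-09-30"},
]
report_b = [
    {"@id": "http://example.org/indicator/1", "matches": "campaign/9"},
    {"@id": "http://example.org/campaign/9", "name": "Example campaign"},
]

graph = {}
for node in report_a + report_b:
    # Merge each fragment into the node it describes.
    merged = graph.setdefault(node["@id"], {})
    merged.update(node)

# The two partial descriptions of indicator/1 are now one node.
print(len(graph), sorted(graph["http://example.org/indicator/1"]))
```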

 

 

From: Davidson II, Mark S [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 1:24 PM
To: Cory Casanave; Wunder, John A.
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Cory,

 

I’m a little unfamiliar with RDF, so I have a clarifying question. In terms of RDF in JSON, is that something that you see security products using directly to interoperate? E.g., my SIEM uses TAXII + STIX/RDF/JSON to talk to my Sensor?

 

Thank you.

-Mark

 

From: [hidden email] [[hidden email]] On Behalf Of Cory Casanave
Sent: Thursday, October 01, 2015 11:09 AM
To: Wunder, John A. <[hidden email]>
Cc: Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

John,

With respect to RDF in JSON, logical data models and other options, I will respond here but also look at updating the wiki. Sorry in advance for the long message – but I think it an important point.

 

JSON has come from an environment of “server applications” supplying data to their “client applications”, where the client applications tended to be coupled and implemented in JavaScript. The use has, of course, broadened, but that is the foundation and what it is very good at. What makes it “easy” is:

·         There is a well defined relationship between the client and server applications, usually under control of the same entity.

·         The server application is primarily in control of what the user will see through the client and how they interact.

·         There is a “dominant decomposition” of the data because it is serving a specific, restricted set of use cases that the data structure and applications are tuned for. A strict data hierarchy works just fine. (Look up “dominant decomposition” – there is a lot of good information on the topic)

·         Data is coming from a single source and can be “bundled” for the next step in the application’s workflow. There is not much need to reference data from other sources or across interactions.

·         The semantics and restrictions of the data are understood within the small team(s) implementing this “client server” relationship – fancy schema or semantic representations are not needed.

·         The data source is the complete authority, at least for the client application.

·         Things don’t change much, and when they do, it is under controlled revision on “both ends”.

·         The application technology is tuned to the data structure – JavaScript directly reflects JSON.

A good example may be the “weather channel” application on your phone and web browser. It is all managed by the weather channel developers (and perhaps their partners) for users (a specific stakeholder group) to get weather information (specific information for a purpose) for a region (the dominant decomposition). I don’t know if they use JSON, but it would be a natural choice. This set of clients is served by servers designed for the above purpose.

 

RDF & the “semantic web” stack has been designed with a very different set of assumptions:

·         Data providers and data consumers are independent and from different organizations, countries and communities.

·         Data providers and data consumers are independently managed.

·         Data providers have no idea what data consumers will use the data for; the consumer is more in control of what they consume and how they use it

·         There are numerous use cases, purposes and viewpoints being served – there is no dominant decomposition.

·         Data may come from multiple sources and the consumer may follow links to get more information, perhaps from the same or different sources. No static fixed “bundles” are practical.

·         Due to the distributed community the data semantics, relations and restrictions must be clearly communicated in a machine readable form.

·         Things change all the time and at different rates

·         No data source is complete; clients may use multiple sources

·         Any number of technology stacks will be used for both data providers and consumers.

An example could be the position and path of all airliners, worldwide.

 

This difference in design intent results in some specific differences in the technology:

·         RDF (and similar structures) are “data graphs” – information points to information without a dominant decomposition.

·         JSON is a strict hierarchy, essentially nested name/value pairs

·         RDF has as its core a type system with ways to describe those types

·         JSON has no type system; everything is a string. There is an assumption that “everyone knows what the tags mean”

·         RDF depends on URIs to reference data – this works within a “document” and across the web. This is where the “Linked data” term comes from (note: linked data may or may not be “open”)


RE: [cti-users] MTI Binding

mdavidson

How does something like JSON-LD fit into the serialization discussion? For the MTI format discussion we are talking about the thing that products will send to each other (I think, anyway). I did some quick reading on RDF / JSON-LD (complete newbie, forgive my ignorance), and I didn’t get a clear picture of how it would fit.

 

For instance, as a completely trivial example, imagine a tool sending indicators out to sensors:

 

{ "type": "indicator", "content-type": "snort-signature", "signature": "alert any any" }

 

Would JSON-LD (or something like it) take the place of the JSON listed above? Or would JSON-LD get automagically translated into something that takes the place of the JSON listed above? Or am I completely off-base in my questions?

 

Thank you.

-Mark

 

From: John K. Smith [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 7:00 PM
To: Shawn Riley <[hidden email]>; Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Just my 2 cents… having used RDF, Turtle (TTL), etc. for security ontologies, I think leveraging something like JSON-LD will help drive better adoption by a broader group.

 

Seems like schema.org is using JSON-LD but I’m not sure to what extent.

 

Thanks,

 

JohnS

 


Re: [cti-users] MTI Binding

Shawn Riley
Mark-

It might be of interest to check out http://json-ld.org/ which contains documentation, specification info, and a JSON-LD playground. It’s maintained by the W3C, so it’s fairly up to date.
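For instance, pasting something like the following into the playground shows the idea: Mark’s trivial indicator recast as JSON-LD (the vocabulary URIs here are made up), where the @context maps the short keys onto URIs and @id gives the record a global identity:

```python
import json

# Mark's trivial snort-signature indicator, recast as JSON-LD.
# The vocabulary URIs are invented for illustration only.
indicator = {
    "@context": {
        "content-type": "http://example.org/cti/content-type",
        "signature": "http://example.org/cti/signature",
    },
    "@id": "http://example.org/cti/indicator/1",
    "@type": "http://example.org/cti/Indicator",
    "content-type": "snort-signature",
    "signature": "alert any any",
}

# Still ordinary JSON on the wire; only the reserved @-keys are new.
print(json.dumps(indicator, sort_keys=True))
```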


Shawn

On Fri, Oct 2, 2015 at 9:42 AM, Davidson II, Mark S <[hidden email]> wrote:

How does something like JSON-LD fit into the serialization discussion? For the MTI format discussion we are talking about the thing that products will send to each other (I think, anyway). I did some quick reading on RDF / JSON-LD (complete newbie, forgive my ignorance), and I didn’t get a clear picture on how it would fit.

 

For instance, as a completely trivial example, imagine a tool sending indicators out to sensors:

 

{ ‘type’: ‘indicator’, ‘content-type’: ‘snort-signature’, ‘signature’: ‘alert any any’}

 

Would JSON-LD (or something like it) take the place of the JSON listed above? Or would JSON-LD get automagically translated into something that takes the place of the JSON listed above? Or am I completely off-base in my questions?

 

Thank you.

-Mark

 

From: John K. Smith [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 7:00 PM
To: Shawn Riley <[hidden email]>; Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Just my 2 cents … having used RDF, TTL etc for security ontologies, I think leveraging something like JSON-LD will help better adoption by broader group.

 

Seems like schema.org is using JSON-LD but I’m not sure to what extent.

 

Thanks,

 

JohnS

 

From: [hidden email] [[hidden email]] On Behalf Of Shawn Riley
Sent: Friday, October 02, 2015 2:45 AM
To: Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

Just wanted to share a couple links that might be of interest here for RDF translation. 

 

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.

 

JSON-LD parser and serializer plugins for RDFLib (Python 2.5+)

 

Here is a online example of a RDF to multi-format translator.

 

 

On Thu, Oct 1, 2015 at 1:39 PM, Cory Casanave <[hidden email]> wrote:

Mark,

Do I see it today? no. There may be some but I don’t know of it.

Could it be used – sure. If you have very atomic data, like a sensor data, RDF can be VERY compact and understandable.

 

Since I NEVER program to the data syntax (Libraries and MDA magic do that) I really don’t care if the data is in JSON or XML, but some do, and I could see a sensor hard coded like that. So the reason I am suggesting looking at the JSON/RDF (JSON-LD) format is that it reads better (and easier to parse) than the same thing encoded in XML while supporting the requirements I mentioned.

 

I should have referenced the “standard” name: Json-ld

 

Other note: I have no vested interest in RDF technologies, its something I use where it is the best choice.

 

Here is some info on Wikipedia: https://en.wikipedia.org/wiki/JSON-LD

 

Other note: I’m not entirely convinced a single “MTI” is a good idea, but if it is a distributed graph structure is the only thing that would scale from a sensor report to a query across millions of data points.

 

 

From: Davidson II, Mark S [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 1:24 PM
To: Cory Casanave; Wunder, John A.
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Cory,

 

I’m a little unfamiliar with RDF, so I have a clarifying question. In terms of RDF in JSON, is that something that you see security products using directly to interoperate? E.g., my SIEM uses TAXII + STIX/RDF/JSON to talk to my Sensor?

 

Thank you.

-Mark

 

From: [hidden email] [[hidden email]] On Behalf Of Cory Casanave
Sent: Thursday, October 01, 2015 11:09 AM
To: Wunder, John A. <[hidden email]>
Cc: Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

John,

With respect to RDF in JSON, logical data models and other options, I will respond here but also look at updating the wiki. Sorry in advance for the long message – but I think it an important point.

 

JSON has come from an environment of “server applications” supplying data to their “client applications”, where the client applications tended to be coupled and implemented in Javascript. The use has, of course, broadened, but that is the foundation and what it is very good at. What makes it “easy” is:

·         There is a well defined relationship between the client and server applications, usually under control of the same entity.

·         The server application is primarily in control of what the user will see through the client and how they interact.

·         There is a “dominate decomposition” of the data because it is serving a specific restricted set of use cases that the data structure and applications are tuned for. A strict data hierarchy works just fine. (Look up “dominate decomposition” – there is a lot of good information on the topic)

·         Data is coming from a single source and can be “bundled” for the next step in the applications workflow. Not much need to reference data from other sources or across interactions.

·         The semantics and restrictions of the data are understood within the small team(s) implementing this “client server” relationship – fancy schema or semantic representations are not needed.

·         The data source is the complete authority, at least for the client application.

·         Things don’t change much and when they do it is under a controlled revision on “both ends”.

·         The application technology is tuned to the data structure – Javascript directly reflects JSON.

A good example may be the “weather channel” application on your phone and web browser. It is all managed by the weather channel developers (and perhaps their partners) for users (specific stakeholder group) to get weather information (specific information for a purpose) for a region (the dominate decomposition). I don’t know if they use JSON, but it would be a natural choice. This set of clients is served by servers designed for the above purpose.

 

RDF & the “semantic web” stack has been designed with a very different set of assumptions:

·         Data providers and data consumers are independent and from different organizations, countries and communities.

·         Data providers and data consumers are independently managed.

·         Data providers have no idea what data consumers will use the data for, the consumer is more in control of what they consume and how they use it

·         There are numerous use cases, purposes and viewpoints being served – there is no dominate decomposition.

·         Data may come from multiple sources and the consumer may follow links to get more information, perhaps from the same or different sources. No static fixed “bundles” are practical.

·         Due to the distributed community the data semantics, relations and restrictions must be clearly communicated in a machine readable form.

·         Things change all the time and at different rates

·         No data source is complete, clients may use multiple sources

·         Any number of technology stacks will be used for both data providers and consumers.

An example could be the position and path of all airliners, worldwide.

 

This difference in design intent results in some specific differences in the technology:

·         RDF (and similar structures) are “data graphs” – information points to information, without a dominant decomposition.

·         JSON is a strict hierarchy: essentially nested name/value pairs.

·         RDF has, at its core, a type system with ways to describe those types.

·         JSON has no real type system – beyond a few primitives, everything is a string. There is an assumption that “everyone knows what the tags mean”.

·         RDF depends on URIs to reference data – this works within a “document” and across the web. This is where the “linked data” term comes from (note: linked data may or may not be “open”).

·         JSON has no reference system at all; you can invent ways to encode references (local or remote) in strings, but they are ad hoc and tend to be untyped.

·         RDF is a data model with multiple syntax representations (XML, JSON, Turtle, etc.).

·         JSON is a data syntax.
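To make the reference point concrete, here is a small self-contained sketch (all names, prefixes and URIs are invented for illustration – this is not STIX or any real vocabulary) of a JSON-LD-style graph in which one node references another by “@id”, something plain JSON gives you no standard way to express:

```python
import json

# A tiny JSON-LD-flavored document: two nodes in an "@graph", where the
# indicator node points at the TTP node by its "@id". Hypothetical example.
doc = json.loads("""
{
  "@context": {"indicates": {"@type": "@id"}},
  "@graph": [
    {"@id": "ex:indicator-1", "@type": "ex:Indicator",
     "indicates": "ex:ttp-9"},
    {"@id": "ex:ttp-9", "@type": "ex:TTP",
     "title": "Spear phishing"}
  ]
}
""")

# Resolving the reference by hand: index the nodes by "@id", follow the link.
nodes = {n["@id"]: n for n in doc["@graph"]}
target = nodes[nodes["ex:indicator-1"]["indicates"]]
print(target["title"])  # Spear phishing
```

In raw JSON the `"indicates": "ex:ttp-9"` string would just be an ad-hoc convention each implementer reinvents; the `@context` is what marks it as a typed reference.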

Here is the rub: programming any application for a more general, more distributed, less “dominated”, less managed and less coupled environment is going to be harder than coding for the coupled, managed, technology-tuned one with a dominant decomposition. Changing the syntax is not going to change that. Encoding the RDF model in JSON does allow a simpler syntax (than RDF/XML or, I think, current STIX) and does allow it to be consumed more easily in many clients, but the developer will still have to cope with references, distribution and “creating their viewpoint” in the application rather than having it handed to them. The flexibility has this cost, and the community has to decide if and how to handle it.

 

As I have suggested earlier, the best case is to make sure the description of your information (as understood by stakeholders) is represented in precise, high-level, machine-readable models that will work with different decompositions and different syntaxes. If this is not the “single source of truth” for what your data means, you will be stuck in a technology – even if it is RDF.

 

If there is going to be one “required” syntax, it had best be one that can reflect this general model well, serve diverse communities, support different technology stacks and be friendly to differing decompositions (no dominant decomposition). Of course, it then has to be as easy to understand and implement as possible under these constraints.

 

Where such general structures are encoded in XML, it becomes complex. This is a combination of the need for the generality and the limits of XML Schema. But don’t blame XML for complexity that is inherent in the generality of CTI. The same complaint is levied against other general XML formats, like NIEM.

 

RDF in JSON syntax provides the type system and the reference system, and allows for a structured composition without requiring it – it is friendlier to this general structure than XML Schema. This seems like a good option. It would be a very good option if generated from a high-level model that would serve to bind all the technologies.

 

Regards,

Cory Casanave

 

 

From: Wunder, John A. [[hidden email]]
Sent: Thursday, October 01, 2015 9:18 AM
To: Cory Casanave
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

Can you elaborate a little, Cory? What are the advantages of RDF in JSON vs. either native JSON, native XML, or RDF in XML? What are the disadvantages?

 

If you could fill it out on the wiki that would be awesome, but if not then e-mail is fine too.

 

John

 

https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis

 

On Sep 30, 2015, at 8:20 PM, Cory Casanave <[hidden email]> wrote:

 

What about RDF in JSON? This then has a well defined schema.

 

From: [hidden email] [[hidden email]] On Behalf Of Jordan, Bret
Sent: Wednesday, September 30, 2015 6:56 PM
To: [hidden email]; [hidden email]
Subject: [cti-users] MTI Binding

 

From the comments so far on the github wiki [1], the consensus right now from the community is for JSON to be used as the MTI (mandatory to implement) binding for STIX. For those that agree or disagree or have a different opinion, please update at least the final Conclusions section with your opinion.  

 

[1] https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis

 

Thanks,

 

Bret

 

 

 

Bret Jordan CISSP

Director of Security Architecture and Standards | Office of the CTO

Blue Coat Systems

PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050

"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 

 

 

 



Re: [cti-users] MTI Binding

Wunder, John A.
I read through that site and, to be honest, I’m still a little confused about what advantages it offers us as an exchange format vs. a binding to raw JSON. It looks more complicated and harder to parse… what does that extra complexity gain us?

(I don’t mean this to be confrontational, would just like to see it explained)

From: Shawn Riley
Date: Friday, October 2, 2015 at 10:28 AM
To: Mark Davidson
Cc: "John K. Smith", Cory Casanave, "Wunder, John A.", "Jordan, Bret", "[hidden email]", "[hidden email]"
Subject: Re: [cti-users] MTI Binding

Mark-

It might be of interest to check out http://json-ld.org/, which contains documentation, specification info, and a JSON-LD playground. It’s maintained by the W3C, so it is fairly up to date.


Shawn

On Fri, Oct 2, 2015 at 9:42 AM, Davidson II, Mark S <[hidden email]> wrote:

How does something like JSON-LD fit into the serialization discussion? For the MTI format discussion we are talking about the thing that products will send to each other (I think, anyway). I did some quick reading on RDF / JSON-LD (complete newbie, forgive my ignorance), and I didn’t get a clear picture on how it would fit.

 

For instance, as a completely trivial example, imagine a tool sending indicators out to sensors:

 

{ "type": "indicator", "content-type": "snort-signature", "signature": "alert any any" }

 

Would JSON-LD (or something like it) take the place of the JSON listed above? Or would JSON-LD get automagically translated into something that takes the place of the JSON listed above? Or am I completely off-base in my questions?

 

Thank you.

-Mark

 

From: John K. Smith [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 7:00 PM
To: Shawn Riley <[hidden email]>; Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Just my 2 cents… having used RDF, TTL, etc. for security ontologies, I think leveraging something like JSON-LD will help drive better adoption by a broader group.

 

Seems like schema.org is using JSON-LD but I’m not sure to what extent.

 

Thanks,

 

JohnS

 

From:[hidden email] [[hidden email]] On Behalf Of Shawn Riley
Sent: Friday, October 02, 2015 2:45 AM
To: Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

Just wanted to share a couple links that might be of interest here for RDF translation. 

 

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.

 

JSON-LD parser and serializer plugins for RDFLib (Python 2.5+)

 

Here is an online example of an RDF-to-multi-format translator.

 

 

On Thu, Oct 1, 2015 at 1:39 PM, Cory Casanave <[hidden email]> wrote:

Mark,

Do I see it today? No. There may be some, but I don’t know of it.

Could it be used – sure. If you have very atomic data, like sensor data, RDF can be VERY compact and understandable.

 

Since I NEVER program to the data syntax (libraries and MDA magic do that), I really don’t care if the data is in JSON or XML – but some do, and I could see a sensor hard-coded like that. So the reason I am suggesting looking at the JSON/RDF (JSON-LD) format is that it reads better (and is easier to parse) than the same thing encoded in XML, while supporting the requirements I mentioned.

 

I should have referenced the “standard” name: JSON-LD.

 

Other note: I have no vested interest in RDF technologies; it’s something I use where it is the best choice.

 

Here is some info on Wikipedia: https://en.wikipedia.org/wiki/JSON-LD

 

Other note: I’m not entirely convinced a single “MTI” is a good idea, but if there is one, a distributed graph structure is the only thing that would scale from a sensor report to a query across millions of data points.

 

 

From: Davidson II, Mark S [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 1:24 PM
To: Cory Casanave; Wunder, John A.
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Cory,

 

I’m a little unfamiliar with RDF, so I have a clarifying question. In terms of RDF in JSON, is that something that you see security products using directly to interoperate? E.g., my SIEM uses TAXII + STIX/RDF/JSON to talk to my Sensor?

 

Thank you.

-Mark

 



RE: [cti-users] MTI Binding

Cory Casanave

John,

First, my assertion that RDF/JSON-LD or something like it will be needed is predicated on the premise of this thread – that we need a single “MTI” for CTI. That would imply that this MTI fully covers all the possible CTI documents for all purposes and viewpoints within scope, and further that this scope is mostly unchanging. The simplistic examples presented do not exercise this scope.

 

If, on the other hand, the desire is to define many small, granular and purpose-specific exchange schemas (e.g. a list of suspect IP addresses from a single party), then something like “raw” JSON (or simpler XML) may be sufficient, provided such a granular exchange schema were somehow mapped to a more comprehensive data model. These purpose-specific schemas seem like the idea of a “profile”, but much more granular than TAXII.

 

With that in mind, to more specifically answer your question, at minimum:

·         JSON-LD does not assist with parsing; it assists in interpreting what you parse.

·         JSON-LD provides a standards-based (W3C) schema (RDF Schema) and a way to bind to that schema: “@context”.

·         JSON-LD provides a way to identify elements (globally, using URIs).

·         JSON-LD provides a way to reference elements (globally, using URIs).

·         JSON-LD provides a way to query data (SPARQL).

·         There is more, but that would seem a good start.

Note that the current XML representation of STIX does all of the above as well – perhaps not as simply, but it does – so we can consider them requirements. I can’t imagine an MTI being viable without these capabilities. It would seem a very bad idea to start with raw JSON and add such capabilities in an ad-hoc way.

 

RE: { "type": "indicator", "content-type": "snort-signature", "signature": "alert any any" } – would JSON-LD (or something like it) take the place of the JSON listed above?

JSON-LD would (optionally) add markup to define where the strings {"type", "indicator", "content-type", "snort-signature", "signature", "alert any any"} are defined and what they mean. It is not “converted” to JSON; it is JSON. JSON is just nested pairs of name/value strings; “LD” defines the content of some of the strings.
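As a concrete sketch of that point (the context URIs below are invented placeholders, not an agreed STIX vocabulary), Mark’s snippet could be annotated roughly like this while remaining ordinary JSON:

```python
import json

# Mark's plain-JSON indicator, unchanged.
plain = {"type": "indicator",
         "content-type": "snort-signature",
         "signature": "alert any any"}

# The same data with JSON-LD-style markup: "@context" says where each term
# is defined; "@type" replaces the ad-hoc "type" string with a typed node.
# All URIs here are hypothetical placeholders.
linked = {
    "@context": {
        "content-type": "http://example.org/cti#contentType",
        "signature": "http://example.org/cti#signature"
    },
    "@type": "http://example.org/cti#Indicator",
    "content-type": plain["content-type"],
    "signature": plain["signature"],
}

# Any plain JSON parser can still consume it; the "@" keys are just strings.
assert json.loads(json.dumps(linked))["signature"] == "alert any any"
```

A consumer that ignores the `@context` sees the same name/value pairs as before; one that understands it can map every term to a shared definition.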

 

-Cory

 

Reply | Threaded
Open this post in threaded view
|

Re: [cti-users] MTI Binding

Wunder, John A.
Thank you, this is what I was looking for. I also saw you commented on the wiki, which is great.

So as I interpret it, as a tool processing STIX data, the advantage of JSON-LD over JSON + JSON Schema is:
- Non-schema-aware parsers can better understand what they’re parsing (because some of the schema is represented in the instance documents)
- A standard approach for identifying elements using URIs (seems useful to me, though couldn’t we just have a field called “id” and require that it be a URI?)
- A standard approach to referencing elements using URIs (also seems useful, though again it seems like we could just have a field called “idref”)
- Data query is interesting, though I’m not sure that query is really a requirement of the format (the only place STIX has used it is data markings and profiles, both of which are things in STIX that don’t work that well)

Maybe it would be useful to have some examples of each approach so the developers can see what it looks like? I can commit to putting together some JSON/JSON-Schema examples next week. We obviously already have XML/XML-Schema examples. Can someone put together something in JSON-LD? How about we look at the campaign model in STIX?

John

PS: I think your idea to define small, granular, and purpose-specific exchange schemas is compelling and one we should explore. It seems like that would make STIX easier to bite off and chew for tools that only do a little (e.g. process IP blacklists), while still providing a consistent overarching model for tools that do a lot, such that they can easily fuse and correlate data across these multiple message types.

On Oct 2, 2015, at 11:22 AM, Cory Casanave <[hidden email]> wrote:

John,

First, my assertion that RDF/JSON-LD or something like it will be needed is predicated on the point of the thread – that we need a single “MTI” for CTI. That would imply that this MTI fully covers all the possible CTI documents for all purposes and viewpoints within scope, and further, that this scope is mostly unchanging. The simplistic examples presented do not exercise this scope.

 

If, on the other hand, the desire is to define many small, granular, purpose-specific exchange schemas (e.g. a list of suspect IP addresses from a single party), then something like “raw” JSON (or simpler XML) may be sufficient, if such a granular exchange schema were somehow mapped to a more comprehensive data model. These purpose-specific schemas seem like the idea of a “profile”, but much more granular than TAXII.

 

With that in mind, to more specifically answer your question, at minimum:

·         JSON-LD does not assist with parsing; it assists in interpreting what you parse.

·         JSON-LD provides a standards-based (W3C) schema (RDF Schema) and a way to bind to that schema: the “@context”

·         JSON-LD provides a way to identify elements (globally, using URI)

·         JSON-LD provides a way to reference elements (globally, using URI)

·         JSON-LD provides a way to query data (SPARQL)

·         There is more, but that would seem a good start.

Note that the current XML representation of STIX does all of the above as well, perhaps not as simply, but it does – so we can consider them requirements. I can’t imagine an MTI being viable without these capabilities. It would seem a very bad idea to start with raw JSON and add such capabilities in an ad-hoc way.

 

RE: { "type": "indicator", "content-type": "snort-signature", "signature": "alert any any" } – Would JSON-LD (or something like it) take the place of the JSON listed above?

JSON-LD would (optionally) add markup to define where the text strings {“type”, “indicator”, “content-type”, “snort-signature”, “signature”, “alert any any”} are defined and what they mean. It is not “converted” to JSON; it is JSON. JSON is just nested name/value pairs. “LD” defines the content of some of the strings.
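To make that concrete, here is a sketch of Mark’s trivial indicator rendered as JSON-LD. This is not an official STIX mapping: the vocabulary URI, the term mappings in the “@context”, and the “@id” value are all invented for illustration.

```python
import json

# Hypothetical JSON-LD rendering of the trivial indicator example.
# The vocabulary URI, term mappings, and @id below are invented for
# illustration -- they are NOT real STIX definitions.
doc = {
    "@context": {                                  # says where the terms are defined
        "stix": "http://example.org/stix-vocab#",
        "type": "stix:type",
        "content-type": "stix:contentType",
        "signature": "stix:signature",
    },
    "@id": "http://example.org/indicators/1234",   # global identifier (a URI)
    "type": "indicator",
    "content-type": "snort-signature",
    "signature": "alert any any",
}

# It is still plain JSON: a consumer that knows nothing about Linked Data
# can parse it and read the same name/value pairs; "@context" and "@id"
# are just two more keys.
parsed = json.loads(json.dumps(doc))
print(parsed["signature"])  # alert any any
```

A non-LD-aware tool simply ignores the “@” keys; an LD-aware tool uses them to map each name to a globally defined term.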

 

-Cory

 

From: Wunder, John A. [[hidden email]]
Sent: Friday, October 02, 2015 10:42 AM
To: Shawn Riley; Davidson II, Mark S
Cc: John K. Smith; Cory Casanave; Jordan, Bret; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

I read through that site and to be honest I’m still a little bit confused about what advantages it offers us as an exchange format vs. a binding to raw JSON. It looks more complicated and harder to parse…what does that extra complexity gain us?

 

(I don’t mean this to be confrontational, would just like to see it explained)

 

From: Shawn Riley
Date: Friday, October 2, 2015 at 10:28 AM
To: Mark Davidson
Cc: "John K. Smith", Cory Casanave, "Wunder, John A.", "Jordan, Bret", "[hidden email]", "[hidden email]"
Subject: Re: [cti-users] MTI Binding

 

Mark-

 

It might be of interest to check out http://json-ld.org/, which contains documentation, specification info, and a JSON-LD playground. It’s maintained by the W3C, so it’s fairly up to date.

 

 

Shawn

 

On Fri, Oct 2, 2015 at 9:42 AM, Davidson II, Mark S <[hidden email]> wrote:

How does something like JSON-LD fit into the serialization discussion? For the MTI format discussion we are talking about the thing that products will send to each other (I think, anyway). I did some quick reading on RDF / JSON-LD (complete newbie, forgive my ignorance), and I didn’t get a clear picture on how it would fit.

 

For instance, as a completely trivial example, imagine a tool sending indicators out to sensors:

 

{ "type": "indicator", "content-type": "snort-signature", "signature": "alert any any" }

 

Would JSON-LD (or something like it) take the place of the JSON listed above? Or would JSON-LD get automagically translated into something that takes the place of the JSON listed above? Or am I completely off-base in my questions?

 

Thank you.

-Mark

 

From: John K. Smith [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 7:00 PM
To: Shawn Riley <[hidden email]>; Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Just my 2 cents … having used RDF, TTL, etc. for security ontologies, I think leveraging something like JSON-LD will help drive better adoption by a broader group.

 

Seems like schema.org is using JSON-LD but I’m not sure to what extent.

 

Thanks,

 

JohnS

 

From:[hidden email] [[hidden email]] On Behalf Of Shawn Riley
Sent: Friday, October 02, 2015 2:45 AM
To: Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

Just wanted to share a couple links that might be of interest here for RDF translation. 

 

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.

 

JSON-LD parser and serializer plugins for RDFLib (Python 2.5+)

 

Here is an online example of an RDF-to-multi-format translator.

 

 

On Thu, Oct 1, 2015 at 1:39 PM, Cory Casanave <[hidden email]> wrote:

Mark,

Do I see it today? No. There may be some, but I don’t know of it.

Could it be used? Sure. If you have very atomic data, like sensor data, RDF can be VERY compact and understandable.

 

Since I NEVER program to the data syntax (libraries and MDA magic do that), I really don’t care if the data is in JSON or XML, but some do, and I could see a sensor hard-coded like that. So the reason I am suggesting looking at the JSON/RDF (JSON-LD) format is that it reads better (and is easier to parse) than the same thing encoded in XML, while supporting the requirements I mentioned.

 

I should have referenced the “standard” name: JSON-LD.

 

Other note: I have no vested interest in RDF technologies; it’s something I use where it is the best choice.

 

Here is some info on Wikipedia: https://en.wikipedia.org/wiki/JSON-LD

 

Other note: I’m not entirely convinced a single “MTI” is a good idea, but if it is, a distributed graph structure is the only thing that would scale from a sensor report to a query across millions of data points.

 

 

From: Davidson II, Mark S [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 1:24 PM
To: Cory Casanave; Wunder, John A.
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Cory,

 

I’m a little unfamiliar with RDF, so I have a clarifying question. In terms of RDF in JSON, is that something that you see security products using directly to interoperate? E.g., my SIEM uses TAXII + STIX/RDF/JSON to talk to my Sensor?

 

Thank you.

-Mark

 

From:[hidden email] [[hidden email]] On Behalf Of Cory Casanave
Sent: Thursday, October 01, 2015 11:09 AM
To: Wunder, John A. <[hidden email]>
Cc: Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

John,

With respect to RDF in JSON, logical data models, and other options, I will respond here but also look at updating the wiki. Sorry in advance for the long message – but I think it is an important point.

 

JSON has come from an environment of “server applications” supplying data to their “client applications”, where the client applications tended to be coupled and implemented in JavaScript. The use has, of course, broadened, but that is the foundation and what it is very good at. What makes it “easy” is:

·         There is a well defined relationship between the client and server applications, usually under control of the same entity.

·         The server application is primarily in control of what the user will see through the client and how they interact.

·         There is a “dominant decomposition” of the data because it is serving a specific, restricted set of use cases that the data structure and applications are tuned for. A strict data hierarchy works just fine. (Look up “dominant decomposition” – there is a lot of good information on the topic.)

·         Data is coming from a single source and can be “bundled” for the next step in the application’s workflow. Not much need to reference data from other sources or across interactions.

·         The semantics and restrictions of the data are understood within the small team(s) implementing this “client server” relationship – fancy schema or semantic representations are not needed.

·         The data source is the complete authority, at least for the client application.

·         Things don’t change much and when they do it is under a controlled revision on “both ends”.

·         The application technology is tuned to the data structure – JavaScript directly reflects JSON.

A good example may be the “weather channel” application on your phone and web browser. It is all managed by the weather channel developers (and perhaps their partners) for users (a specific stakeholder group) to get weather information (specific information for a purpose) for a region (the dominant decomposition). I don’t know if they use JSON, but it would be a natural choice. This set of clients is served by servers designed for the above purpose.

 

RDF & the “semantic web” stack has been designed with a very different set of assumptions:

·         Data providers and data consumers are independent and from different organizations, countries and communities.

·         Data providers and data consumers are independently managed.

·         Data providers have no idea what data consumers will use the data for; the consumer is more in control of what they consume and how they use it

·         There are numerous use cases, purposes and viewpoints being served – there is no dominant decomposition.

·         Data may come from multiple sources and the consumer may follow links to get more information, perhaps from the same or different sources. No static fixed “bundles” are practical.

·         Due to the distributed community, the data semantics, relations and restrictions must be clearly communicated in a machine-readable form.

·         Things change all the time and at different rates

·         No data source is complete, clients may use multiple sources

·         Any number of technology stacks will be used for both data providers and consumers.

An example could be the position and path of all airliners, worldwide.

 

This difference in design intent results in some specific differences in the technology:

·         RDF (and similar structures) are “data graphs” – information points to information without a dominant decomposition.

·         JSON is a strict hierarchy, essentially nested name/value pairs

·         RDF has as its core a type system with ways to describe those types

·         JSON has no type system beyond its primitives; there is an assumption that “everyone knows what the tags mean”

·         RDF depends on URIs to reference data – this works within a “document” and across the web. This is where the “Linked data” term comes from (note: linked data may or may not be “open”)

·         JSON has no reference system at all; you can invent ways to encode references (local or remote) in strings, but they are ad hoc and tend to be untyped

·         RDF is a data model with multiple syntax representations (XML, JSON, Turtle, etc.)

·         JSON is a data syntax

Here is the rub: programming any application for a more general, more distributed, less “dominant”, less managed, and less coupled environment is going to be harder than coding for the coupled, dominantly managed, and technology-tuned one. Changing the syntax is not going to change that. Encoding the RDF model in JSON does allow a simpler syntax (than RDF/XML or, I think, current STIX) and does allow it to be consumed more easily in many clients, but the developer will still have to cope with references, distribution, and “creating their viewpoint” in the application rather than having it handed to them. The flexibility has this cost, and the community has to decide if and how to handle it.
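As a rough illustration of that cost, the sketch below shows the ad-hoc reference resolution a consumer must build itself when graph-shaped data is exchanged as raw JSON. The field names (“id”, “idref-campaign”) and the sample objects are invented for illustration, not STIX definitions.

```python
import json

# Two raw-JSON objects that reference each other only by convention.
# Field names ("id", "idref-campaign") are invented for illustration.
indicator = json.loads('{"id": "indicator-1", "idref-campaign": "campaign-9"}')
campaign = json.loads('{"id": "campaign-9", "name": "Example Campaign"}')

# The consumer has to build its own index of identifiers...
index = {obj["id"]: obj for obj in (indicator, campaign)}

def resolve(idref):
    # ...and its own lookup. The reference is untyped: nothing says what
    # kind of object an idref points at, or where to fetch it if it is
    # not in the local bundle.
    return index.get(idref)

print(resolve(indicator["idref-campaign"])["name"])  # Example Campaign
```

With a linked-data representation, the identifier and reference mechanism (URIs) come standardized instead of being reinvented per exchange; the work of following references across sources, however, remains with the consumer either way.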

 

As I have suggested earlier, the best case is to make sure the description of your information (as understood by stakeholders) is represented in precise, high-level, machine-readable models that will work with different decompositions and different syntaxes. If this is not the “single source of truth” for what your data means, you will be stuck in a technology – even if it is RDF.

 

If there is going to be one “required” syntax, it had best be one that can reflect this general model well, serve diverse communities, support different technology stacks, and be friendly to differing decompositions (no dominant decomposition). Of course, it then has to be as easy to understand and implement as possible under these constraints.

 

Where such general structures are encoded in XML, the result becomes complex. This is a combination of the need for generality and the limits of XML Schema. But don’t blame XML for complexity that is inherent in the generality of CTI; the same complaint is levied against other general XML formats, like NIEM.

 

RDF in JSON syntax provides the type system and reference system, and allows for structured composition without requiring it – it is friendlier to this general structure than XML Schema. This seems like a good option. It would be a very good option if generated from a high-level model that would serve to bind all the technologies.

 

Regards,

Cory Casanave

 

 

From: Wunder, John A. [[hidden email]]
Sent: Thursday, October 01, 2015 9:18 AM
To: Cory Casanave
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

Can you elaborate a little, Cory? What are the advantages of RDF in JSON vs. either native JSON, native XML, or RDF in XML? What are the disadvantages?

 

If you could fill it out on the wiki that would be awesome, but if not then e-mail is fine too.

 

John

 

https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis

 

On Sep 30, 2015, at 8:20 PM, Cory Casanave <[hidden email]> wrote:

 

What about RDF in JSON? This then has a well defined schema.

 

From:[hidden email] [[hidden email]] On Behalf Of Jordan, Bret
Sent: Wednesday, September 30, 2015 6:56 PM
To: [hidden email]; [hidden email]
Subject: [cti-users] MTI Binding

 

From the comments so far on the github wiki [1], the consensus right now from the community is for JSON to be used as the MTI (mandatory to implement) binding for STIX. For those that agree or disagree or have a different opinion, please update at least the final Conclusions section with your opinion.  

 

[1] https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis

 

Thanks,

 

Bret

 

 

 

Bret Jordan CISSP

Director of Security Architecture and Standards | Office of the CTO

Blue Coat Systems

PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050

"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 

 

 

 

 


Reply | Threaded
Open this post in threaded view
|

RE: [cti-users] MTI Binding

Cory Casanave

Re: Examples.

Pick your examples; I can help out. I would prefer to baseline off the same schema subset and example data; current STIX is fine for defining the examples. I suggest at least one that is very simple “pure hierarchical data” and at least one with some related entities.

-Cory

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Wunder, John A.
Sent: Friday, October 02, 2015 11:36 AM
To: [hidden email]; [hidden email]
Subject: [cti-stix] Re: [cti-users] MTI Binding

 

Thank you, this is what I was looking for. I also saw you commented on the wiki, which is great.

 

So as I interpret it, as a tool processing STIX data, the advantage of JSON-LD over JSON + JSON Schema is:

- Non-schema-aware parsers can better understand what they’re parsing (because some of the schema is represented in the instance documents)

- A standard approach for identifying elements using URIs (seems useful to me, though couldn’t we just have a field called “id” and require that it be a URI?)

- A standard approach to referencing elements using URIs (also seems useful, though again it seems like we could just have a field called “idref”)

- Data query is interesting, though I’m not sure that query is really a requirement of the format (the only place STIX has used it is data markings and profiles, both of which are things in STIX that don’t work that well)

 

Maybe it would be useful to have some examples of each approach so the developers can see what it looks like? I can commit to putting together some JSON/JSON-Schema examples next week. We obviously already have XML/XML-Schema examples. Can someone put together something in JSON-LD? How about we look at the campaign model in STIX?

 

John

 

PS: I think your idea to define small, granular, and purpose-specific exchange schemas is compelling and one we should explore. It seems like that would make STIX easier to bite off and chew for tools that only do a little (e.g. process IP blacklists), while still providing a consistent overarching model for tools that do a lot, such that they can easily fuse and correlate data across these multiple message types.

 

On Oct 2, 2015, at 11:22 AM, Cory Casanave <[hidden email]> wrote:

 

John,

First, my assertion that RDF/JSON-LD or something like it will be needed is predicated on the point of the thread – that we need a single “MTI” for CTI. That would imply that this MTI fully covers all the possible CTI documents for all purposes and viewpoints within scope, and further, that this scope is mostly unchanging. The simplistic examples presented do not exercise this scope.

 

If, on the other hand, the desire is to define many small, granular, purpose-specific exchange schemas (e.g. a list of suspect IP addresses from a single party), then something like “raw” JSON (or simpler XML) may be sufficient, if such a granular exchange schema were somehow mapped to a more comprehensive data model. These purpose-specific schemas seem like the idea of a “profile”, but much more granular than TAXII.

 

With that in mind, to more specifically answer your question, at minimum:

·         JSON-LD does not assist with parsing; it assists in interpreting what you parse.

·         JSON-LD provides a standards-based (W3C) schema (RDF Schema) and a way to bind to that schema: the “@context”

·         JSON-LD provides a way to identify elements (globally, using URI)

·         JSON-LD provides a way to reference elements (globally, using URI)

·         JSON-LD provides a way to query data (SPARQL)

·         There is more, but that would seem a good start.

Note that the current XML representation of STIX does all of the above as well, perhaps not as simply, but it does – so we can consider them requirements. I can’t imagine an MTI being viable without these capabilities. It would seem a very bad idea to start with raw JSON and add such capabilities in an ad-hoc way.

 

RE: { "type": "indicator", "content-type": "snort-signature", "signature": "alert any any" } – Would JSON-LD (or something like it) take the place of the JSON listed above?

JSON-LD would (optionally) add markup to define where the text strings {“type”, “indicator”, “content-type”, “snort-signature”, “signature”, “alert any any”} are defined and what they mean. It is not “converted” to JSON; it is JSON. JSON is just nested name/value pairs. “LD” defines the content of some of the strings.

 

-Cory

 

From: Wunder, John A. [[hidden email]]
Sent: Friday, October 02, 2015 10:42 AM
To: Shawn Riley; Davidson II, Mark S
Cc: John K. Smith; Cory Casanave; Jordan, Bret; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

I read through that site and to be honest I’m still a little bit confused about what advantages it offers us as an exchange format vs. a binding to raw JSON. It looks more complicated and harder to parse…what does that extra complexity gain us?

 

(I don’t mean this to be confrontational, would just like to see it explained)

 

From: Shawn Riley
Date: Friday, October 2, 2015 at 10:28 AM
To: Mark Davidson
Cc: "John K. Smith", Cory Casanave, "Wunder, John A.", "Jordan, Bret", "[hidden email]", "[hidden email]"
Subject: Re: [cti-users] MTI Binding

 

Mark-

 

It might be of interest to check out http://json-ld.org/, which contains documentation, specification info, and a JSON-LD playground. It’s maintained by the W3C, so it’s fairly up to date.

 

 

Shawn

 

On Fri, Oct 2, 2015 at 9:42 AM, Davidson II, Mark S <[hidden email]> wrote:

How does something like JSON-LD fit into the serialization discussion? For the MTI format discussion we are talking about the thing that products will send to each other (I think, anyway). I did some quick reading on RDF / JSON-LD (complete newbie, forgive my ignorance), and I didn’t get a clear picture on how it would fit.

 

For instance, as a completely trivial example, imagine a tool sending indicators out to sensors:

 

{ "type": "indicator", "content-type": "snort-signature", "signature": "alert any any" }

 

Would JSON-LD (or something like it) take the place of the JSON listed above? Or would JSON-LD get automagically translated into something that takes the place of the JSON listed above? Or am I completely off-base in my questions?

 

Thank you.

-Mark

 

From: John K. Smith [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 7:00 PM
To: Shawn Riley <[hidden email]>; Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Just my 2 cents … having used RDF, TTL, etc. for security ontologies, I think leveraging something like JSON-LD will help drive better adoption by a broader group.

 

Seems like schema.org is using JSON-LD but I’m not sure to what extent.

 

Thanks,

 

JohnS

 

From:[hidden email] [[hidden email]] On Behalf Of Shawn Riley
Sent: Friday, October 02, 2015 2:45 AM
To: Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

Just wanted to share a couple links that might be of interest here for RDF translation. 

 

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.

 

JSON-LD parser and serializer plugins for RDFLib (Python 2.5+)

 

Here is an online example of an RDF-to-multi-format translator.

 

 

On Thu, Oct 1, 2015 at 1:39 PM, Cory Casanave <[hidden email]> wrote:

Mark,

Do I see it today? No. There may be some, but I don’t know of it.

Could it be used? Sure. If you have very atomic data, like sensor data, RDF can be VERY compact and understandable.

 

Since I NEVER program to the data syntax (libraries and MDA magic do that), I really don’t care if the data is in JSON or XML, but some do, and I could see a sensor hard-coded like that. So the reason I am suggesting looking at the JSON/RDF (JSON-LD) format is that it reads better (and is easier to parse) than the same thing encoded in XML, while supporting the requirements I mentioned.

 

I should have referenced the “standard” name: JSON-LD.

 

Other note: I have no vested interest in RDF technologies; it’s something I use where it is the best choice.

 

Here is some info on Wikipedia: https://en.wikipedia.org/wiki/JSON-LD

 

Other note: I’m not entirely convinced a single “MTI” is a good idea, but if it is, a distributed graph structure is the only thing that would scale from a sensor report to a query across millions of data points.

 

 

From: Davidson II, Mark S [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 1:24 PM
To: Cory Casanave; Wunder, John A.
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Cory,

 

I’m a little unfamiliar with RDF, so I have a clarifying question. In terms of RDF in JSON, is that something that you see security products using directly to interoperate? E.g., my SIEM uses TAXII + STIX/RDF/JSON to talk to my Sensor?

 

Thank you.

-Mark

 

From:[hidden email] [[hidden email]] On Behalf Of Cory Casanave
Sent: Thursday, October 01, 2015 11:09 AM
To: Wunder, John A. <[hidden email]>
Cc: Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

John,

With respect to RDF in JSON, logical data models, and other options, I will respond here but also look at updating the wiki. Sorry in advance for the long message – but I think it is an important point.

 

JSON has come from an environment of “server applications” supplying data to their “client applications”, where the client applications tended to be coupled and implemented in JavaScript. The use has, of course, broadened, but that is the foundation and what it is very good at. What makes it “easy” is:

·         There is a well defined relationship between the client and server applications, usually under control of the same entity.

·         The server application is primarily in control of what the user will see through the client and how they interact.

·         There is a “dominant decomposition” of the data because it is serving a specific, restricted set of use cases that the data structure and applications are tuned for. A strict data hierarchy works just fine. (Look up “dominant decomposition” – there is a lot of good information on the topic.)

·         Data is coming from a single source and can be “bundled” for the next step in the application’s workflow. Not much need to reference data from other sources or across interactions.

·         The semantics and restrictions of the data are understood within the small team(s) implementing this “client server” relationship – fancy schema or semantic representations are not needed.

·         The data source is the complete authority, at least for the client application.

·         Things don’t change much and when they do it is under a controlled revision on “both ends”.

·         The application technology is tuned to the data structure – JavaScript directly reflects JSON.

A good example may be the “weather channel” application on your phone and web browser. It is all managed by the weather channel developers (and perhaps their partners) for users (a specific stakeholder group) to get weather information (specific information for a purpose) for a region (the dominant decomposition). I don’t know if they use JSON, but it would be a natural choice. This set of clients is served by servers designed for the above purpose.

 

RDF & the “semantic web” stack has been designed with a very different set of assumptions:

·         Data providers and data consumers are independent and from different organizations, countries and communities.

·         Data providers and data consumers are independently managed.

·         Data providers have no idea what data consumers will use the data for; the consumer is more in control of what they consume and how they use it

·         There are numerous use cases, purposes and viewpoints being served – there is no dominant decomposition.

·         Data may come from multiple sources and the consumer may follow links to get more information, perhaps from the same or different sources. No static fixed “bundles” are practical.

·         Due to the distributed community, the data semantics, relations and restrictions must be clearly communicated in a machine-readable form.

·         Things change all the time and at different rates

·         No data source is complete, clients may use multiple sources

·         Any number of technology stacks will be used for both data providers and consumers.

An example could be the position and path of all airliners, worldwide.

 

This difference in design intent results in some specific differences in the technology:

·         RDF (and similar structures) are “data graphs” – information points to information without a dominant decomposition.

·         JSON is a strict hierarchy, essentially nested name/value pairs

·         RDF has as its core a type system with ways to describe those types

·         JSON has no type system beyond its primitives; there is an assumption that “everyone knows what the tags mean”

·         RDF depends on URIs to reference data – this works within a “document” and across the web. This is where the “Linked data” term comes from (note: linked data may or may not be “open”)

·         JSON has no reference system at all; you can invent ways to encode references (local or remote) in strings, but they are ad hoc and tend to be untyped

·         RDF is a data model with multiple syntax representations (XML, JSON, Turtle, etc.)

·         JSON is a data syntax

Here is the rub: programming any application for a more general, more distributed, less “dominant”, less managed, and less coupled environment is going to be harder than coding for the coupled, dominantly managed, and technology-tuned one. Changing the syntax is not going to change that. Encoding the RDF model in JSON does allow a simpler syntax (than RDF/XML or, I think, current STIX) and does allow it to be consumed more easily in many clients, but the developer will still have to cope with references, distribution, and “creating their viewpoint” in the application rather than having it handed to them. The flexibility has this cost, and the community has to decide if and how to handle it.

 

As I have suggested earlier, the best case is to make sure the description of your information (as understood by stakeholders) is represented in precise, high-level, machine-readable models that will work with different decompositions and different syntaxes. If this is not the “single source of truth” for what your data means, you will be stuck in a technology – even if it is RDF.

 

If there is going to be one “required” syntax, it had best be one that can reflect this general model well, serve diverse communities, support different technology stacks, and be friendly to differing decompositions (no dominant decomposition). Of course, it then has to be as easy to understand and implement as possible under these constraints.

 

Where such general structures are encoded in XML, it becomes complex. This is a combination of the need for generality and the limits of XML Schema. But don’t blame XML for complexity that is inherent in the generality of CTI; the same complaint is leveled at other general XML formats, like NIEM.

 

RDF in JSON syntax provides the type system and reference system, and allows for structured composition without requiring it – it is friendlier to this general structure than XML Schema. This seems like a good option. It would be a very good option if generated from a high-level model that would serve to bind all the technologies.

 

Regards,

Cory Casanave

 

 

From: Wunder, John A. [[hidden email]]
Sent: Thursday, October 01, 2015 9:18 AM
To: Cory Casanave
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

Can you elaborate a little, Cory? What are the advantages of RDF in JSON vs. either native JSON, native XML, or RDF in XML? What are the disadvantages?

 

If you could fill it out on the wiki that would be awesome, but if not then e-mail is fine too.

 

John

 

https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis

 

On Sep 30, 2015, at 8:20 PM, Cory Casanave <[hidden email]> wrote:

 

What about RDF in JSON? This then has a well defined schema.

 

From:[hidden email] [[hidden email]] On Behalf Of Jordan, Bret
Sent: Wednesday, September 30, 2015 6:56 PM
To: [hidden email]; [hidden email]
Subject: [cti-users] MTI Binding

 

From the comments so far on the github wiki [1], the consensus right now from the community is for JSON to be used as the MTI (mandatory to implement) binding for STIX. For those that agree or disagree or have a different opinion, please update at least the final Conclusions section with your opinion.  

 

[1] https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis

 

Thanks,

 

Bret

 

 

 

Bret Jordan CISSP

Director of Security Architecture and Standards | Office of the CTO

Blue Coat Systems

PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050

"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 

 

 

 

 

 


RE: [cti-users] MTI Binding

Camp, Warren (CTR)
In reply to this post by Shawn Riley

Is part of the issue that we are using a SQL database architecture versus a document database (e.g., MongoDB) or a graph database (e.g., Neo4j)?

From: [hidden email] [mailto:[hidden email]] On Behalf Of Shawn Riley
Sent: Friday, October 02, 2015 10:29 AM
To: Davidson II, Mark S
Cc: John K. Smith; Cory Casanave; Wunder, John A.; Jordan, Bret; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

Mark-

 

It might be of interest to check out http://json-ld.org/, which contains documentation, specification info, and a JSON-LD playground. It's maintained by the W3C, so it is fairly up to date.

 

 

Shawn

 

On Fri, Oct 2, 2015 at 9:42 AM, Davidson II, Mark S <[hidden email]> wrote:

How does something like JSON-LD fit into the serialization discussion? For the MTI format discussion we are talking about the thing that products will send to each other (I think, anyway). I did some quick reading on RDF / JSON-LD (complete newbie, forgive my ignorance), and I didn’t get a clear picture of how it would fit.

 

For instance, as a completely trivial example, imagine a tool sending indicators out to sensors:

 

{"type": "indicator", "content-type": "snort-signature", "signature": "alert any any"}

 

Would JSON-LD (or something like it) take the place of the JSON listed above? Or would JSON-LD get automagically translated into something that takes the place of the JSON listed above? Or am I completely off-base in my questions?
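Purely as an illustration of what such a rendering could look like (the context URL, node identifier, and any key mappings below are invented, not taken from any STIX or TAXII specification), the trivial indicator above might become:

```python
import json

# Hypothetical JSON-LD version of the trivial indicator. "@context",
# "@id", and "@type" are the JSON-LD additions; the payload keys are
# unchanged, so a JSON-LD-unaware sensor can still read them as JSON.
doc = json.loads("""
{
  "@context": "https://example.org/cti-context.jsonld",
  "@id": "https://example.org/indicator/1",
  "@type": "indicator",
  "content-type": "snort-signature",
  "signature": "alert any any"
}
""")

print(doc["signature"])  # alert any any
```

In that reading, JSON-LD does not replace the JSON so much as annotate it: the same document serves both plain-JSON and linked-data consumers.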

 

Thank you.

-Mark

 

From: John K. Smith [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 7:00 PM
To: Shawn Riley <[hidden email]>; Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Just my 2 cents … having used RDF, TTL, etc. for security ontologies, I think leveraging something like JSON-LD will help drive adoption by a broader group.

 

Seems like schema.org is using JSON-LD, but I’m not sure to what extent.

 

Thanks,

 

JohnS

 

From: [hidden email] [[hidden email]] On Behalf Of Shawn Riley
Sent: Friday, October 02, 2015 2:45 AM
To: Cory Casanave <[hidden email]>
Cc: Davidson II, Mark S <[hidden email]>; Wunder, John A. <[hidden email]>; Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: Re: [cti-users] MTI Binding

 

Just wanted to share a couple links that might be of interest here for RDF translation. 

 

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.

 

JSON-LD parser and serializer plugins for RDFLib (Python 2.5+)

 

Here is an online example of an RDF-to-multi-format translator.

 

 

On Thu, Oct 1, 2015 at 1:39 PM, Cory Casanave <[hidden email]> wrote:

Mark,

Do I see it today? No. There may be some, but I don’t know of it.

Could it be used – sure. If you have very atomic data, like sensor data, RDF can be VERY compact and understandable.

 

Since I NEVER program to the data syntax (libraries and MDA magic do that), I really don’t care if the data is in JSON or XML, but some do, and I could see a sensor hard-coded like that. So the reason I am suggesting looking at the JSON/RDF (JSON-LD) format is that it reads better (and is easier to parse) than the same thing encoded in XML, while supporting the requirements I mentioned.

 

I should have referenced the “standard” name: JSON-LD.

 

Other note: I have no vested interest in RDF technologies; it’s something I use where it is the best choice.

 

Here is some info on Wikipedia: https://en.wikipedia.org/wiki/JSON-LD

 

Other note: I’m not entirely convinced a single “MTI” is a good idea, but if there is one, a distributed graph structure is the only thing that would scale from a sensor report to a query across millions of data points.
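A minimal sketch of why a graph structure merges cleanly across independent sources, with all node URIs and field names invented for illustration:

```python
# Two hypothetical feeds describe overlapping data. Because each node is
# keyed by a URI ("@id"), records can be merged mechanically, whatever
# server they came from, with no dominant hierarchy imposed.
feed_a = [{"@id": "https://example.org/malware/zeus", "name": "Zeus"}]
feed_b = [{"@id": "https://example.org/malware/zeus", "family": "banking-trojan"},
          {"@id": "https://example.org/indicator/42",
           "indicates": {"@id": "https://example.org/malware/zeus"}}]

# Merge node properties by URI; later feeds add to (or override) earlier ones.
graph = {}
for node in feed_a + feed_b:
    graph.setdefault(node["@id"], {}).update(node)

zeus = graph["https://example.org/malware/zeus"]
print(zeus["name"], zeus["family"])  # Zeus banking-trojan
```

The same merge applied to two arbitrary JSON hierarchies has no such mechanical answer, which is the scaling point above.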

 

 

From: Davidson II, Mark S [mailto:[hidden email]]
Sent: Thursday, October 01, 2015 1:24 PM
To: Cory Casanave; Wunder, John A.
Cc: Jordan, Bret; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 

Cory,

 

I’m a little unfamiliar with RDF, so I have a clarifying question. In terms of RDF in JSON, is that something that you see security products using directly to interoperate? E.g., my SIEM uses TAXII + STIX/RDF/JSON to talk to my Sensor?

 

Thank you.

-Mark

 

From: [hidden email] [[hidden email]] On Behalf Of Cory Casanave
Sent: Thursday, October 01, 2015 11:09 AM
To: Wunder, John A. <[hidden email]>
Cc: Jordan, Bret <[hidden email]>; [hidden email]; [hidden email]
Subject: RE: [cti-users] MTI Binding

 



Re: [cti-users] MTI Binding

Wunder, John A.
In reply to this post by Cory Casanave
How about we take two of the idioms on the stixproject.github.io site?


Thanks for helping out. I think it would be nice to see these as:

- Current STIX XML (Done already)
- Simplified XML (TBD, maybe if the JSON one is quick I’ll do this too)
- JSON/JSON-Schema (Wunder)
- JSON-LD (Casanave)
- Any others people are interested in (PMML, Thrift, ProtoBuf, etc.)

John

On Oct 2, 2015, at 11:50 AM, Cory Casanave <[hidden email]> wrote:

Re: Examples.

Pick your examples; I can help out. I’d prefer to baseline off the same schema subset and example data; current STIX is fine for defining the examples. I suggest at least one that is very simple “pure hierarchical data” and at least one with some related entities.

-Cory


[cti-users] Re: [cti-stix] [cti-users] MTI Binding

Jordan, Bret
I think this is a great idea.


Thanks,

Bret



Bret Jordan CISSP
Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems
PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 


Re: [cti-users] Re: [cti-stix] [cti-users] MTI Binding

Terry MacDonald

+1. It's a nice idea, as we can see a size and complexity comparison. Is there any chance each person can document the process that the generation took? I'm thinking it could be useful to see how complicated the toolchain for developing each type of output is.

Cheers
Terry MacDonald


Re: [cti-users] [cti-stix] [cti-users] MTI Binding

Jordan, Bret
It would also be good to see how complex it is to consume the data into some structs in, say, C++ or PHP/JavaScript.


Thanks,

Bret



Bret Jordan CISSP
Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems
PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 
