
Re: Similar effort to formally describe event data


Wunder, John A.
Has anyone had a chance to look at the files that Phil sent out? If you don't want to go digging through them, here's an example of an event from their proposal:

{
      "eventReference":{
         "profileReference":{
            "uuid":"00112233-1234-5678-9abc-ed0123456789",
            "version":1
         },
         "eventId":1,
         "version":1
      },
      "criticality":"DEBUG",
      "dateTime":"2001-12-31T12:00:00",
      "recordId":"1234",
      "fields":[
         {
            "name":"username",
            "value":"jsmith"
         },
         {
            "name":"hostname",
            "value":"foo.dod.mil"
         }
      ]
 }

As you can see, there are a few differences from CEE JSON:

1. Profiles are referenced using a UUID and version rather than a URI
2. Profiles include a list of "event classes", which define specific fields that are required for specific event types. In contrast, CEE profiles (as currently implemented) tend to just provide a list of fields you CAN or MUST use for that profile without providing a list of event classes.
3. Events have a record ID (we've talked about this before but as currently implemented we don't require that)
4. Event fields are contained in a "fields" array with "name"/"value" key/value pairs rather than standard JSON key/value pairs.
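To make difference 4 concrete, here is a rough Python sketch that converts between the two layouts: the proposal's "fields" array of name/value pairs and ordinary JSON key/value pairs. The event data is copied from the example above; the function names flatten_fields and expand_fields are made up for illustration and are not defined by CEE or JALoP.

```python
import json

# Event in the proposed layout (values copied from the example above).
jalop_event = json.loads("""
{
  "eventReference": {
    "profileReference": {"uuid": "00112233-1234-5678-9abc-ed0123456789", "version": 1},
    "eventId": 1,
    "version": 1
  },
  "criticality": "DEBUG",
  "dateTime": "2001-12-31T12:00:00",
  "recordId": "1234",
  "fields": [
    {"name": "username", "value": "jsmith"},
    {"name": "hostname", "value": "foo.dod.mil"}
  ]
}
""")


def flatten_fields(event):
    """Turn the "fields" name/value array into an ordinary JSON object."""
    flat = {}
    for field in event.get("fields", []):
        name = field["name"]
        if name in flat:
            # An array can repeat a name; a plain object cannot.
            raise ValueError(f"duplicate field name: {name}")
        flat[name] = field["value"]
    return flat


def expand_fields(flat):
    """Inverse direction: key/value pairs back to a name/value array."""
    return [{"name": name, "value": value} for name, value in flat.items()]
```

One practical difference the sketch surfaces: the array form can carry the same name twice, while the plain object form cannot, so a converter has to decide what to do with repeats.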

Thoughts?

Personally I'm undecided on 1, prefer 2 to CEE, prefer 3 to CEE, and prefer CEE to 4. Phil, can you give some explanation behind 1 and 4? Specifically:
1. Why use a UUID rather than a URI? The advantages of a URI are that you can make it a resolvable URL for ease of use and it provides more human-readable profile references (i.e. in CEE I can tell what a profile is by looking at the URI, whereas a UUID is opaque unless you know where to go to find it).
4. Why use the "fields" structure? I realize we've talked about this before, I just want to get the explanation on the list for everyone else to see.

John

-----Original Message-----
From: Wunder, John A. [mailto:[hidden email]]
Sent: Thursday, August 30, 2012 3:36 PM
To: cee-discussion-list CEE-Related Discussion
Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event data

I've been working with Phil and the JALoP team to help them understand CEE and where we're coming from as well as to understand more about where they're coming from. So at this point I'm thinking I have a pretty good understanding of the differences.

In my opinion the most important distinction is that we have different focus areas. JALoP is focusing on syntax and (outside of the audit schema) the transport. CEE has the syntax component but also has more of a focus on defining the event taxonomy and core field dictionary. So to that extent if we move towards the JALoP schemas our existing work on the taxonomy and field dictionary wouldn't be significantly affected. The caveat is that we'll have to go back and forth a little bit on how the taxonomy ends up being represented, since although their concept is kind of a taxonomy it's implemented very differently from ours.
 
In the syntax specifically, the main philosophical difference is that CEE's primary focus has been JSON and lightweight uses while JALoP's has been XML and more "heavy" uses (that require validation, integrity, etc). That ends up being reflected in some of the structures that we use. The proposal Phil has here includes both JSON and XML, just like CEE does, but because it started as XML the JSON is influenced a little by that. Take a look at what he sent (the .piz file is a .zip file w/ the schemas) and compare it to what we have now.

I think it's unlikely that we'd end up doing a full replacement of the CEE syntax with the JALoP audit syntax, although of course that's definitely an option. But in any case we should look at what's out there and take the best pieces of each. If it would help I could set up a telecon where Phil and the team can explain what they've done and you all can ask questions.

I'd especially want to get the opinion of event record PRODUCERS here, since they're the ones who will be using the syntax.

John

-----Original Message-----
From: Philip Black-Knight [mailto:[hidden email]]
Sent: Thursday, August 30, 2012 1:17 PM
To: cee-discussion-list CEE-Related Discussion
Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event data

I'm working on a government funded project (JALoP). The primary goals are to author & implement a protocol to reliably transfer what we refer to as journal, audit, and log data. Much of our efforts are related to what CEE labels the CLT.

However, we've also done some work (i.e. the schemas I just sent out) related to describing log data in a strict format. Because our schemas are similar (and have similar goals) we were hoping that it would be possible to combine our efforts.

We feel that our schemas are a lot simpler and easier to understand than the current CEE approach, and would like to see the CEE format transformed into something a little simpler and easier to follow.

-Phil

-----Original Message-----
From: Raffael Marty [mailto:[hidden email]]
Sent: Thursday, August 30, 2012 12:12 PM
To: Philip Black-Knight
Cc: [hidden email]
Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event data

Hi Phil,

When you say "we have been ...", who is we?

The goals you list are pretty much the ones we are addressing with CEE. Glad we are all on the same page. Would love to hear your specific input on CEE.

Thanks

  -raffy

--
  Raffael Marty
  @raffaelmarty                                          http://raffy.ch

On Aug 30, 2012, at 5:15 AM, Philip Black-Knight <[hidden email]> wrote:

> We've been working on a similar effort for describing and enumerating events. Our architecture is comprised of 3 distinct documents:
>
> -       EventProfileList (profiles.xsd) - An EventProfileList is an enumeration of events for a class of device (e.g. router, Cisco router, UNIX system, Solaris). The EventProfileList can be for a generic device (e.g. router, switch, UNIX system) or specific to a vendor/product (e.g. Cisco Router, Juniper switch, Oracle Solaris). This is similar to how SNMP works. The type for each field is recorded in the EventProfile, and would be one of the existing CEE dictionary types.
>
> -       Events (events.xsd) - An Events document is a listing of one or more event instances (called records). Each record references an Event from a specific EventProfile.
>
> -       Translations (translations.xsd) - A Translation document provides a human readable, language specific, version of the events enumerated in an EventProfileList. To display the details of an Event, you would use a record from an Events document, combined with the translation from a Translations document, to render a human readable string in the user's language of choice.
>
> For example, an EventProfileList could include an event that indicates a user logged on to a particular system. In the EventProfileList, the descriptor for this login event would indicate the required fields: user_name, hostname, and tty. When a login event occurs, the system would record the event in an Events document, including the values for these fields. A Translations document (for US English) might include the format string: "%user_name% logged on to %hostname% on %tty% at %timestamp%". Using this format string, and the event record from the Events document, a human readable string can be rendered for the user.
>
> One of the primary goals is to move away from loosely formatted event messages which are hard to parse, and can change from one version of software package to the next. Using a common format to enumerate event records makes processing and correlating data simpler. This is all part of a larger effort to define data formats and protocols for reliably transferring Journal/Audit/Logging events. Additionally, we'd like to be able to align various software products that perform similar tasks (i.e. routers, or network switches, etc) to all use a common set of events so that a router built by Cisco & a router built by Juniper Networks would both generate the same events. Each vendor would be free to augment their list of events (by providing their own EventProfileLists) and would be able to define new events, and/or extend existing events.
>
> Existing syslog data is easily recorded using this approach. The EventProfile for syslog data indicates the required fields for msg (the actual syslog message), facility, priority, hostname, applicationName, etc that are part of standard syslog messages. The attached zip includes an EventProfileList & sample Events document for syslog.
>  
>
> Another important goal was to support existing logging systems. While creating these schemas, we studied Syslog, Windows Event Structure, Solaris Audit, and Linux Audit and believe that all can be represented in our Event structures with no loss of information or context.
>
> Other features:
>
> -       Support optional digital signatures of all XML files to meet government requirements (Sarbanes-Oxley, Dodd-Frank, HIPAA, etc.) for audit log integrity and non-repudiation
>
> -       Updates to language translations (both new languages and to fix errors in translations) can be added without the need to update a single line of code.
>
> -       Both the EventProfileLists and the Event descriptors within the EventProfile list are versioned, allowing event records & translations to target a specific version of an event.
>
> -       Documents containing Event records may be in XML or JSON format
>
> -       Limited support for Event inheritance. Each Event descriptor can inherit from some other Event descriptor (multiple inheritance is not supported). For example, a generic profile could exist that describes a "logon" event, and specifies the required fields username, hostname, and time. The EventProfileList for Windows would identify a login event that inherits from this base, and adds the fields first_name, last_name, guid, etc. The EventProfileList for Linux would add fields for uid, tty, etc. When a Linux Logon event is recorded, in addition to the uid and tty, it must also specify the fields from the inherited class (i.e. username, hostname, and time). This is similar to the current CEE approach to taxonomies.
>
>  
> To further support this effort, we've been working on a network protocol to reliably and securely transfer Event records off box. These protocols are designed to support raw XML as well as compressed formats such as Efficient XML Interchange. Attached are a draft set of schemas and sample documents (it is just a renamed zip file).
>
> -Phil
>
> <eventSchemas.1814d42.piz>
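
The rendering step described in Phil's quoted message (an event record combined with a Translations format string) can be sketched in a few lines of Python. This is only an illustration under assumptions: the render function, the dictionary layout of the record, and the tty value "pts/0" are invented here, and the real Translations documents are XML files whose structure is not shown in this thread. Only the format string itself comes from the message.

```python
import re

# Format string quoted from the message above (US English translation).
TEMPLATE = "%user_name% logged on to %hostname% on %tty% at %timestamp%"


def render(template, values):
    """Replace each %name% placeholder with the matching record value."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for placeholder %{name}%")
        return str(values[name])

    return re.sub(r"%(\w+)%", substitute, template)


# Hypothetical record values; "pts/0" is a made-up tty.
record = {
    "user_name": "jsmith",
    "hostname": "foo.dod.mil",
    "tty": "pts/0",
    "timestamp": "2001-12-31T12:00:00",
}

message = render(TEMPLATE, record)
# message == "jsmith logged on to foo.dod.mil on pts/0 at 2001-12-31T12:00:00"
```

Because the template lives outside the code, a new language only needs a new template string, which is the point made under "Other features" about updating translations without code changes.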

Re: Similar effort to formally describe event data

Wunder, John A.
So I never really got a lot of response to this.

After our last CEE Board call (almost two weeks ago, I've been late sending this) there was general agreement that considering such a big list of changes so close to the 1.0 release is probably a bad idea. So the board's recommendation was to hold off on these changes until a post-1.0 release and then re-evaluate them one at a time as we find the time and the need.

Do people here agree with that? Does anyone here want to argue for considering one of these changes for the current version?

A related question is why doesn't CEE have a unique event identifier (#3 in the list of changes below) as a required field? I remember seeing it in an older version but it seems to have been removed. Does anyone know the reason for removing it? Should we consider adding it back? Or if I'm mistaken and CEE has never included it, should we add it?

John

>-----Original Message-----
>From: Wunder, John A. [mailto:[hidden email]]
>Sent: Wednesday, September 05, 2012 9:14 AM
>To: cee-discussion-list CEE-Related Discussion
>Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe event
>data
>
>Has anyone had a chance to look at the files that Phil sent out? If you don't
>want to go digging through them, here's an example of an event from their
>proposal:
>
>{
>      "eventReference":{
>         "profileReference":{
>            "uuid":"00112233-1234-5678-9abc-ed0123456789",
>            "version":1
>         },
>         "eventId":1,
>         "version":1
>      },
>      "criticality":"DEBUG",
>      "dateTime":"2001-12-31T12:00:00",
>      "recordId":"1234",
>      "fields":[
>         {
>            "name":"username",
>            "value":"jsmith"
>         },
>         {
>            "name":"hostname",
>            "value":"foo.dod.mil"
>         }
>      ]
> }
>
>As you can see, there are a few differences from CEE JSON:
>
>1. Profiles are referenced using a UUID and version rather than a URI
>2. Profiles include a list of "event classes", which define specific fields that are
>required for specific event types. In contrast, CEE profiles (as currently
>implemented) tend to just provide a list of fields you CAN or MUST use for
>that profile without providing a list of event classes.
>3. Events have a record ID (we've talked about this before but as currently
>implemented we don't require that)
>4. Event fields are contained in a "fields" array with "name"/"value" key/value
>pairs rather than standard JSON key/value pairs.
>
>Thoughts?
>
>Personally I'm undecided on 1, prefer 2 to CEE, prefer 3 to CEE, and prefer CEE
>to 4. Phil, can you give some explanation behind 1 and 4? Specifically:
>1. Why use a UUID rather than a URI? The advantages of a URI are that you
>can make it a resolvable URL for ease of use and it provides more human-
>readable profile references (i.e. in CEE I can tell what a profile is by looking at
>the URI, whereas a UUID is opaque unless you know where to go to find it).
>4. Why use the "fields" structure? I realize we've talked about this before, I
>just want to get the explanation on the list for everyone else to see.
>
>John
>
>-----Original Message-----
>From: Wunder, John A. [mailto:[hidden email]]
>Sent: Thursday, August 30, 2012 3:36 PM
>To: cee-discussion-list CEE-Related Discussion
>Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event
>data
>
>I've been working with Phil and the JALoP team to help them understand CEE
>and where we're coming from as well as to understand more about where
>they're coming from. So at this point I'm thinking I have a pretty good
>understanding of the differences.
>
>In my opinion the most important distinction is that we have different focus
>areas. JALoP is focusing on syntax and (outside of the audit schema) the
>transport. CEE has the syntax component but also has more of a focus on
>defining the event taxonomy and core field dictionary. So to that extent if we
>move towards the JALoP schemas our existing work on the taxonomy and
>field dictionary wouldn't be significantly affected. The caveat is that we'll have
>to go back and forth a little bit on how the taxonomy ends up being
>represented, since although their concept is kind of a taxonomy it's
>implemented very differently from ours.
>
>In the syntax specifically, the main philosophical difference is that CEE's
>primary focus has been JSON and lightweight uses while JALoP's has been XML
>and more "heavy" uses (that require validation, integrity, etc). That ends up
>being reflected in some of the structures that we use. The proposal Phil has
>here includes both JSON and XML, just like CEE does, but because it started as
>XML the JSON is influenced a little by that. Take a look at what he sent (the
>.piz file is a .zip file w/ the schemas) and compare it to what we have now.
>
>I think it's unlikely that we'd end up doing a full replacement of the CEE syntax
>with the JALoP audit syntax, although of course that's definitely an option. But
>in any case we should look at what's out there and take the best pieces of
>each. If it would help I could set up a telecon where Phil and the team can
>explain what they've done and you all can ask questions.
>
>I'd especially want to get the opinion of event record PRODUCERS here, since
>they're the ones who will be using the syntax.
>
>John
>
>-----Original Message-----
>From: Philip Black-Knight [mailto:[hidden email]]
>Sent: Thursday, August 30, 2012 1:17 PM
>To: cee-discussion-list CEE-Related Discussion
>Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event
>data
>
>I'm working on a government funded project (JALoP). The primary goals are to
>author & implement a protocol to reliably transfer what we refer to as journal,
>audit, and log data. Much of our efforts are related to what CEE labels the CLT.
>
>However, we've also done some work (i.e. the schemas I just sent out)
>related to describing log data in a strict format. Because our schemas are
>similar (and have similar goals) we were hoping that it would be possible to
>combine our efforts.
>
>We feel that our schemas are a lot simpler and easier to understand than the
>current CEE approach, and would like to see the CEE format transformed into
>something a little simpler and easier to follow.
>
>-Phil
>
>-----Original Message-----
>From: Raffael Marty [mailto:[hidden email]]
>Sent: Thursday, August 30, 2012 12:12 PM
>To: Philip Black-Knight
>Cc: [hidden email]
>Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event
>data
>
>Hi Phil,
>
>When you say "we have been ...", who is we?
>
>The goals you list are pretty much the ones we are addressing with CEE. Glad
>we are all on the same page. Would love to hear your specific input on CEE.
>
>Thanks
>
>  -raffy
>
>--
>  Raffael Marty
>  @raffaelmarty                                          http://raffy.ch
>
>On Aug 30, 2012, at 5:15 AM, Philip Black-Knight <[hidden email]>
>wrote:
>
>> We've been working on a similar effort for describing and enumerating
>events. Our architecture is comprised of 3 distinct documents:
>>
>> -       EventProfileList (profiles.xsd) - An EventProfileList is an enumeration of
>events for a class of device (e.g. router, Cisco router, UNIX system, Solaris).
>The EventProfileList can be for a generic device (e.g. router, switch, UNIX
>system) or specific to a vendor/product (e.g. Cisco Router, Juniper switch,
>Oracle Solaris). This is similar to how SNMP works. The type for each field is
>recorded in the EventProfile, and would be one of the existing CEE dictionary
>types.
>>
>> -       Events (events.xsd) - An Events document is a listing of one or more
>event instances (called records). Each record references an Event from a
>specific EventProfile.
>>
>> -       Translations (translations.xsd) - A Translation document provides a
>human readable, language specific, version of the events enumerated in an
>EventProfileList. To display the details of an Event, you would use a record
>from an Events document, combined with the translation from a Translations
>document, to render a human readable string in the user's language of choice.
>>
>> For example, an EventProfileList could include an event that indicates a user
>logged on to a particular system. In the EventProfileList, the descriptor for this
>login event would indicate the required fields: user_name, hostname, and tty.
>When a login event occurs, the system would record the event in an Events
>document, including the values for these fields. A Translations document (for
>US English) might include the format string: "%user_name% logged on to
>%hostname% on %tty% at %timestamp%". Using this format string, and the
>event record from the Events document, a human readable string can be
>rendered for the user.
>>
>> One of the primary goals is to move away from loosely formatted event
>messages which are hard to parse, and can change from one version of
>software package to the next. Using a common format to enumerate event
>records makes processing and correlating data simpler. This is all part of a
>larger effort to define data formats and protocols for reliably transferring
>Journal/Audit/Logging events. Additionally, we'd like to be able to align
>various software products that perform similar tasks (i.e. routers, or network
>switches, etc) to all use a common set of events so that a router built by Cisco
>& a router built by Juniper Networks would both generate the same events.
>Each vendor would be free to augment their list of events (by providing their
>own EventProfileLists) and would be able to define new events, and/or
>extend existing events.
>>
>> Existing syslog data is easily recorded using this approach. The EventProfile
>for syslog data indicates the required fields for msg (the actual syslog
>message), facility, priority, hostname, applicationName, etc that are part of
>standard syslog messages. The attached zip includes an EventProfileList &
>sample Events document for syslog.
>>
>>
>> Another important goal was to support existing logging systems. While
>creating these schemas, we studied Syslog, Windows Event Structure, Solaris
>Audit, and Linux Audit and believe that all can be represented in our Event
>structures with no loss of information or context.
>>
>> Other features:
>>
>> -       Support optional digital signatures of all XML files to meet government
>requirements (Sarbanes-Oxley, Dodd-Frank, HIPAA, etc.) for audit log
>integrity and non-repudiation
>>
>> -       Updates to language translations (both new languages and to fix errors
>in translations) can be added without the need to update a single line of code.
>>
>> -       Both the EventProfileLists and the Event descriptors within the
>EventProfile list are versioned, allowing event records & translations to target
>a specific version of an event.
>>
>> -       Documents containing Event records may be in XML or JSON format
>>
>> -       Limited support for Event inheritance. Each Event descriptor can inherit
>from some other Event descriptor (multiple inheritance is not supported). For
>example, a generic profile could exist that describes a "logon" event, and
>specifies the required fields username, hostname, and time. The
>EventProfileList for Windows would identify a login event that inherits from
>this base, and adds the fields first_name, last_name, guid, etc. The
>EventProfileList for Linux would add fields for uid, tty, etc. When a Linux Logon
>event is recorded, in addition to the uid and tty, it must also specify the fields
>from the inherited class (i.e. username, hostname, and time). This is similar to
>the current CEE approach to taxonomies.
>>
>>
>> To further support this effort, we've been working on a network
>> protocol to reliably and securely transfer Event records off box. These
>protocols are designed to support raw XML as well as compressed formats
>such as Efficient XML Interchange. Attached are a draft set of schemas and
>sample documents (it is just a renamed zip file).
>>
>> -Phil
>>
>> <eventSchemas.1814d42.piz>
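
The single-inheritance rule in the quoted proposal (a Linux logon record must carry its own fields plus those of the generic "logon" descriptor) could be checked along these lines. This is a sketch only: the DESCRIPTORS dictionary, the descriptor names, and both functions are hypothetical stand-ins, since the actual descriptors live in XML EventProfileList documents that are not reproduced in this thread. The field names are the ones given in the message.

```python
# Hypothetical in-memory stand-in for Event descriptors.
DESCRIPTORS = {
    "logon": {"parent": None, "required": ["username", "hostname", "time"]},
    "linux_logon": {"parent": "logon", "required": ["uid", "tty"]},
    "windows_logon": {
        "parent": "logon",
        "required": ["first_name", "last_name", "guid"],
    },
}


def required_fields(name, descriptors=DESCRIPTORS):
    """Collect required fields along the inheritance chain, base first."""
    chain = []
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError(f"inheritance cycle at {name}")
        seen.add(name)
        chain.append(descriptors[name])
        name = descriptors[name]["parent"]  # single parent only
    fields = []
    for descriptor in reversed(chain):
        fields.extend(descriptor["required"])
    return fields


def missing_fields(name, record_fields):
    """Return the required fields that a record fails to supply."""
    return [f for f in required_fields(name) if f not in record_fields]
```

For example, a "linux_logon" record that supplies only uid, tty, and username would be reported as missing hostname and time.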

Re: Similar effort to formally describe event data

Mitch Thomas
Please consider adding it (back).


On 9/23/12 12:53 PM, "Wunder, John A." <[hidden email]> wrote:

>Should we consider adding it back? Or if I'm mistaken and CEE has never
>included it, should we add it?

Re: Similar effort to formally describe event data

zrlram
What are the exact semantics of the field? An increasing ID? A UUID?
--
  Raffael Marty
  ceo @ pixlcloud                                        http://pixlcloud.com
  @raffaelmarty                                          http://raffy.ch



On Sep 24, 2012, at 8:45 AM, Mitch Thomas <[hidden email]> wrote:

> Please consider adding it (back).
>
>
> On 9/23/12 12:53 PM, "Wunder, John A." <[hidden email]> wrote:
>
>> Should we consider adding it back? Or if I'm mistaken and CEE has never
>> included it, should we add it?

Re: Similar effort to formally describe event data

Mitch Thomas
I don't know the answer to your questions.

However, I do have some experience with events and I've found that, whatever the
format, it's useful to have an identity with which two different pieces of
software can refer to the same event.   It's harder to assign events a
sortable/Java-esque comparable identity, but the ability to order events
is also very useful, particularly these days with the plethora of
distributed/concurrent computing going on.


On 9/24/12 8:51 AM, "Raffael Marty" <[hidden email]> wrote:

>What are the exact semantics of the field? An increasing ID? A UUID?
>
>On Sep 24, 2012, at 8:45 AM, Mitch Thomas <[hidden email]> wrote:
>
>> Please consider adding it (back).
>>
>>
>> On 9/23/12 12:53 PM, "Wunder, John A." <[hidden email]> wrote:
>>
>>> Should we consider adding it back? Or if I'm mistaken and CEE has never
>>> included it, should we add it?

Re: Similar effort to formally describe event data

Adam Montville
In reply to this post by zrlram
On 9/24/12 8:51 AM, "Raffael Marty" <[hidden email]> wrote:

>What are the exact semantics of the field? An increasing ID? A UUID?


This is a good question.  How do you envision this working between use
cases?  We have n collectors receiving events.  For each collector, they
are either receiving CEE-formatted events or not.  In the case where they
are receiving CEE-formatted events, then what should we expect to see in
such a field?  On the other hand, if events are normalized into CEE, then
what would we expect to see?

I would argue that the latter need not be described in too much detail,
but that the former will need some operational support beyond the
specification, unless we attempt a globally unique identification scheme,
such as using a URI of some sort, which may become burdensome quickly.

Caveat with what I've said is that I'm dive bombing into the middle of a
discussion - fully aware that there may be information I'm lacking at this
point.  Mitch is probably closer to the discussion than I.

Regards,

Adam
 






Re: Similar effort to formally describe event data

Wunder, John A.
Definitely an important question. But it's also important to not overthink it.

Some of the options I can think of are:

1. UUID. Upside is a guarantee of uniqueness, downside might be generation time.
2. Unique string or integer assigned by producer. This could be implemented as an incrementing ID but no reason to require that in the spec. Basically the producer would just assign unique IDs as a best effort and consumers would need to know that there could be conflicts from different producers.
3. #2 plus add a producer (or "session") ID. That ID could be a UUID since it wouldn't need to be generated as often, but would give the producer some namespace in which to produce unique IDs.
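
As a rough illustration of the three options, here is how each might look in Python. Everything named here is an assumption for the sake of the sketch: the class and function names are invented, "producerId" is a hypothetical field name, and nothing below is taken from the CEE or JALoP specifications.

```python
import itertools
import uuid


def uuid_record_id():
    """Option 1: a random (version 4) UUID generated for every record."""
    return str(uuid.uuid4())


class CounterIds:
    """Option 2: producer-assigned incrementing ID, unique per producer only."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)

    def next_id(self):
        return str(next(self._counter))


class NamespacedIds:
    """Option 3: one UUID per producer (or session) plus a cheap local counter."""

    def __init__(self):
        # Generated once, so the UUID cost is not paid per event.
        self.producer_id = str(uuid.uuid4())
        self._counter = itertools.count(1)

    def next_id(self):
        # "producerId" is a hypothetical field name used for illustration.
        return {
            "producerId": self.producer_id,
            "recordId": str(next(self._counter)),
        }
```

The sketch ignores persistence: a counter that restarts at 1 after a reboot would repeat IDs under option 2, while option 3 avoids that only if a fresh producer ID is generated for each session.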

Another thing to consider is whether the recordId would be required or optional.

John
________________________________________
From: Adam Montville [[hidden email]]
Sent: Monday, September 24, 2012 12:18 PM
To: cee-discussion-list CEE-Related Discussion
Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe event data

On 9/24/12 8:51 AM, "Raffael Marty" <[hidden email]> wrote:

>What are the exact semantics of the field? An increasing ID? A UUID?


This is a good question.  How do you envision this working between use
cases?  We have n collectors receiving events.  For each collector, they
are either receiving CEE-formatted events or not.  In the case where they
are receiving CEE-formatted events, then what should we expect to see in
such a field?  On the other hand, if events are normalized into CEE, then
what would we expect to see?

I would argue that the latter need not be described in too much detail,
but that the former will need some operational support beyond the
specification, unless we attempt a globally unique identification scheme,
such as using a URI of some sort, which may become burdensome quickly.

Caveat with what I've said is that I'm dive bombing into the middle of a
discussion - fully aware that there may be information I'm lacking at this
point.  Mitch is probably closer to the discussion than I.

Regards,

Adam







Re: Similar effort to formally describe event data

zrlram
#3 is completely different from the other semantics. That's a sessionID, which should not be in this field. Different field, different discussion. I think the idea here is a unique ID. I don't like #1. You can just use hash(data) as a UUID. Why require that? I think in general, I am against requiring it. I think we need it in general though.
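
A minimal sketch of the hash(data) idea, under assumptions of my own choosing: SHA-256 as the hash (the thread does not name one), canonical JSON with sorted keys as the serialization, and a hypothetical helper name content_id. The recordId itself is excluded from the hashed content so the ID does not depend on itself.

```python
import hashlib
import json


def content_id(event):
    """Derive an ID from the event's content instead of storing a counter."""
    body = {key: value for key, value in event.items() if key != "recordId"}
    # Sorted keys and fixed separators make the serialization deterministic,
    # so two parties hashing the same event get the same digest.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

One consequence worth noting: two events with identical content (same fields, same timestamp) hash to the same ID, so a content hash identifies distinct content rather than distinct occurrences.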

  -raffy

--
  Raffael Marty
  ceo @ pixlcloud                                        http://pixlcloud.com
  @raffaelmarty                                          http://raffy.ch



On Sep 24, 2012, at 9:39 AM, "Wunder, John A." <[hidden email]> wrote:

> Definitely an important question. But it's also important to not overthink it.
>
> Some of the options I can think of are:
>
> 1. UUID. Upside is a guarantee of uniqueness, downside might be generation time.
> 2. Unique string or integer assigned by producer. This could be implemented as an incrementing ID but no reason to require that in the spec. Basically the producer would just assign unique IDs as a best effort and consumers would need to know that there could be conflicts from different producers.
> 3. #2 plus add a producer (or "session") ID. That ID could be a UUID since it wouldn't need to be generated as often, but would give the producer some namespace in which to produce unique IDs.
>
> Another thing to consider is whether the recordId would be required or optional.
>
> John
> ________________________________________
> From: Adam Montville [[hidden email]]
> Sent: Monday, September 24, 2012 12:18 PM
> To: cee-discussion-list CEE-Related Discussion
> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe event data
>
> On 9/24/12 8:51 AM, "Raffael Marty" <[hidden email]> wrote:
>
>> What are the exact semantics of the field? An increasing ID? A UUID?
>
>
> This is a good question.  How do you envision this working between use
> cases?  We have n collectors receiving events.  For each collector, they
> are either receiving CEE-formatted events or not.  In the case where they
> are receiving CEE-formatted events, then what should we expect to see in
> such a field?  On the other hand, if events are normalized into CEE, then
> what would we expect to see?
>
> I would argue that the latter need not be described in too much detail,
> but that the former will need some operational support beyond the
> specification, unless we attempt a globally unique identification scheme,
> such as using a URI of some sort, which may become burdensome quickly.
>
> Caveat with what I've said is that I'm dive bombing into the middle of a
> discussion - fully aware that there may be information I'm lacking at this
> point.  Mitch is probably closer to the discussion than I.
>
> Regards,
>
> Adam
>
>
>
>
>
>
>>
>> --
>> Raffael Marty
>> ceo @ pixlcloud
>> http://pixlcloud.com
>> @raffaelmarty                                          http://raffy.ch
>>
>>
>>
>> On Sep 24, 2012, at 8:45 AM, Mitch Thomas <[hidden email]> wrote:
>>
>>> Please consider adding it (back).
>>>
>>>
>>> On 9/23/12 12:53 PM, "Wunder, John A." <[hidden email]> wrote:
>>>
>>>> Should we consider adding it back? Or if I'm mistaken and CEE has never
>>>> included it, should we add it?
>>
>>

Re: Similar effort to formally describe event data

Adam Montville
On 9/24/12 9:44 AM, "Raffael Marty" <[hidden email]> wrote:

>#3 is completely different from the other semantics. That's a sessionID,
>which should not be in this field. Different field, different discussion.

Agree.

>I think the idea here is a unique ID. I don't like #1. You can just use
>hash(data) as a UUID. Why require that? I think in general, I am against
>requiring it. I think we need it in general though.

Which hash would you use and what are the precise uniqueness requirements?
 Global in the context of a single enterprise, or global across all
enterprises?  How many events would that mean we'd be generating IDs for
over what period of time?  Then, what are the collision rates of the
chosen hash?  It'll probably all work out, but being precise here might be
prudent.

About setting it as a requirement, what are the interoperability
implications if we don't require the field to be populated?
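To put rough numbers on the collision-rate question above, here is a minimal sketch using the standard birthday-bound approximation. The event volume (one billion events per day for ten years) is a made-up workload chosen only for illustration, and the calculation assumes the hash output behaves like a uniformly random value.

```python
import math

def collision_probability(num_events: int, hash_bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    space = 2.0 ** hash_bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-num_events * (num_events - 1) / (2.0 * space))

# Hypothetical workload (an assumption, not a figure from this thread):
# 1e9 events per day for 10 years.
events = 10**9 * 365 * 10

for bits in (64, 128, 256):
    print(f"{bits}-bit ID: p(collision) ~= {collision_probability(events, bits):.3g}")
```

Under these assumptions a 64-bit identifier is almost certain to collide at that volume, while 128 bits or more makes a collision negligible. The answer depends heavily on the scope chosen (single enterprise versus all enterprises), which is why the scope question matters.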


>
>  -raffy
>
>--
>  Raffael Marty
>  ceo @ pixlcloud
>http://pixlcloud.com
>  @raffaelmarty                                          http://raffy.ch
>
>
>
>On Sep 24, 2012, at 9:39 AM, "Wunder, John A." <[hidden email]> wrote:
>
>> Definitely an important question. But, it's almost important to not
>>overthink it.
>>
>> Some of the options I can think of are:
>>
>> 1. UUID. Upside is a guarantee of uniqueness, downside might be
>>generation time.
>> 2. Unique string or integer assigned by producer. This could be
>>implemented as an incrementing ID but no reason to require that in the
>>spec. Basically the producer would just assign unique IDs as a best
>>effort and consumers would need to know that there could be conflicts
>>from different producers.
>> 3. #2 plus add a producer (or "session") ID. That ID could be a UUID
>>since it wouldn't need to be generated as often, but would give the
>>producer some namespace in which to produce unique IDs.
>>
>> Another thing to consider is whether the recordId would be required or
>>optional.
>>
>> John
>> ________________________________________
>> From: Adam Montville [[hidden email]]
>> Sent: Monday, September 24, 2012 12:18 PM
>> To: cee-discussion-list CEE-Related Discussion
>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe
>>event data
>>
>> On 9/24/12 8:51 AM, "Raffael Marty" <[hidden email]> wrote:
>>
>>> What are the exact semantics of the field? An increasing ID? A UUID?
>>
>>
>> This is a good question.  How do you envision this working between use
>> cases?  We have n collectors receiving events.  For each collector, they
>> are either receiving CEE-formatted events or not.  In the case where
>>they
>> are receiving CEE-formatted events, then what should we expect to see in
>> such a field?  On the other hand, if events are normalized into CEE,
>>then
>> what would we expect to see?
>>
>> I would argue that the latter need not be described in too much detail,
>> but that the former will need some operational support beyond the
>> specification, unless we attempt a globally unique identification
>>scheme,
>> such as using a URI of some sort, which may become burdensome quickly.
>>
>> Caveat with what I've said is that I'm dive bombing into the middle of a
>> discussion - fully aware that there may be information I'm lacking at
>>this
>> point.  Mitch is probably closer to the discussion than I.
>>
>> Regards,
>>
>> Adam
>>
>>
>>
>>
>>
>>
>>>
>>> --
>>> Raffael Marty
>>> ceo @ pixlcloud
>>> http://pixlcloud.com
>>> @raffaelmarty                                          http://raffy.ch
>>>
>>>
>>>
>>> On Sep 24, 2012, at 8:45 AM, Mitch Thomas <[hidden email]> wrote:
>>>
>>>> Please consider adding it (back).
>>>>
>>>>
>>>> On 9/23/12 12:53 PM, "Wunder, John A." <[hidden email]> wrote:
>>>>
>>>>> Should we consider adding it back? Or if I'm mistaken and CEE has
>>>>>never
>>>>> included it, should we add it?
>>>
>>>
>
>

Re: Similar effort to formally describe event data

Wunder, John A.
Well my point with #3 is that if you have a sessionId you don't need to worry about a globally unique recordId. You just create a session-unique recordID and combine it with the sessionId to give you a globally unique ID.
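A minimal sketch of that idea, assuming a random UUID per producer session and a simple incrementing counter per record. The class name and the "sessionId"/"recordId" field names are illustrative only and are not taken from the CEE or JALoP specifications.

```python
import itertools
import uuid

class RecordIdGenerator:
    """Pairs a per-session UUID with a cheap incrementing counter."""

    def __init__(self) -> None:
        # Generated once per producer session, so the UUID cost is paid once.
        self.session_id = str(uuid.uuid4())
        self._counter = itertools.count(1)

    def next_ids(self) -> dict:
        # The pair (sessionId, recordId) acts as the globally unique key.
        return {"sessionId": self.session_id, "recordId": next(self._counter)}

gen = RecordIdGenerator()
first, second = gen.next_ids(), gen.next_ids()
print(first, second)
```

With a random (version 4) UUID the global uniqueness is probabilistic rather than guaranteed, and the counter is only unique within one session, so a consumer would have to compare both values together.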

John
________________________________________
From: Adam Montville [[hidden email]]
Sent: Monday, September 24, 2012 12:53 PM
To: cee-discussion-list CEE-Related Discussion
Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe event data

On 9/24/12 9:44 AM, "Raffael Marty" <[hidden email]> wrote:

>#3 is completely different from the other semantics. That's a sessionID,
>which should not be in this field. Different field, different discussion.

Agree.

>I think the idea here is a unique ID. I don't like #1. You can just use
>hash(data) as a UUID. Why require that? I think in general, I am against
>requiring it. I think we need it in general though.

Which hash would you use and what are the precise uniqueness requirements?
 Global in the context of a single enterprise, or global across all
enterprises?  How many events would that mean we'd be generating IDs for
over what period of time?  Then, what are the collision rates of the
chosen hash?  It'll probably all work out, but being precise here might be
prudent.

About setting it as a requirement, what are the interoperability
implications if we don't require the field to be populated?


>
>  -raffy
>
>--
>  Raffael Marty
>  ceo @ pixlcloud
>http://pixlcloud.com
>  @raffaelmarty                                          http://raffy.ch
>
>
>
>On Sep 24, 2012, at 9:39 AM, "Wunder, John A." <[hidden email]> wrote:
>
>> Definitely an important question. But, it's almost important to not
>>overthink it.
>>
>> Some of the options I can think of are:
>>
>> 1. UUID. Upside is a guarantee of uniqueness, downside might be
>>generation time.
>> 2. Unique string or integer assigned by producer. This could be
>>implemented as an incrementing ID but no reason to require that in the
>>spec. Basically the producer would just assign unique IDs as a best
>>effort and consumers would need to know that there could be conflicts
>>from different producers.
>> 3. #2 plus add a producer (or "session") ID. That ID could be a UUID
>>since it wouldn't need to be generated as often, but would give the
>>producer some namespace in which to produce unique IDs.
>>
>> Another thing to consider is whether the recordId would be required or
>>optional.
>>
>> John
>> ________________________________________
>> From: Adam Montville [[hidden email]]
>> Sent: Monday, September 24, 2012 12:18 PM
>> To: cee-discussion-list CEE-Related Discussion
>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe
>>event data
>>
>> On 9/24/12 8:51 AM, "Raffael Marty" <[hidden email]> wrote:
>>
>>> What are the exact semantics of the field? An increasing ID? A UUID?
>>
>>
>> This is a good question.  How do you envision this working between use
>> cases?  We have n collectors receiving events.  For each collector, they
>> are either receiving CEE-formatted events or not.  In the case where
>>they
>> are receiving CEE-formatted events, then what should we expect to see in
>> such a field?  On the other hand, if events are normalized into CEE,
>>then
>> what would we expect to see?
>>
>> I would argue that the latter need not be described in too much detail,
>> but that the former will need some operational support beyond the
>> specification, unless we attempt a globally unique identification
>>scheme,
>> such as using a URI of some sort, which may become burdensome quickly.
>>
>> Caveat with what I've said is that I'm dive bombing into the middle of a
>> discussion - fully aware that there may be information I'm lacking at
>>this
>> point.  Mitch is probably closer to the discussion than I.
>>
>> Regards,
>>
>> Adam
>>
>>
>>
>>
>>
>>
>>>
>>> --
>>> Raffael Marty
>>> ceo @ pixlcloud
>>> http://pixlcloud.com
>>> @raffaelmarty                                          http://raffy.ch
>>>
>>>
>>>
>>> On Sep 24, 2012, at 8:45 AM, Mitch Thomas <[hidden email]> wrote:
>>>
>>>> Please consider adding it (back).
>>>>
>>>>
>>>> On 9/23/12 12:53 PM, "Wunder, John A." <[hidden email]> wrote:
>>>>
>>>>> Should we consider adding it back? Or if I'm mistaken and CEE has
>>>>>never
>>>>> included it, should we add it?
>>>
>>>
>
>

Re: Similar effort to formally describe event data

Mitch Thomas
In reply to this post by zrlram
The hash approach is interesting in that it would add some anti-tampering
benefits.  I imagine it would have the same computational requirements as
the UUID, and as Adam M. pointed out lots more details would have to be
understood e.g. Which algorithm.  Further, it would be important to define
how to organize the data for hashing (the ordering of bytes is important)
for example if the data is JSON or XML text, in either case should white
space be considered and of course the field ordering would have to be
enforced which weakens the utility that name/value pairs that JSON and XML
afford.
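The ordering and whitespace concern can be illustrated with a small sketch. It assumes SHA-256 and a simplified canonical form (sorted keys, no insignificant whitespace, UTF-8); neither choice comes from CEE, and this is not a complete canonicalization scheme such as RFC 8785.

```python
import hashlib
import json

def event_hash(event: dict) -> str:
    """Hash a simplified canonical JSON form of the event."""
    canonical = json.dumps(
        event,
        sort_keys=True,          # field order no longer affects the bytes
        separators=(",", ":"),   # drop insignificant whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Same fields, different order: the canonical form makes the hashes agree.
a = {"username": "jsmith", "hostname": "foo.dod.mil"}
b = {"hostname": "foo.dod.mil", "username": "jsmith"}
print(event_hash(a) == event_hash(b))
```

Canonicalizing before hashing means producers can still emit fields in any order. One caveat of any content hash is that two events with identical content get the same ID, so a hash alone may not distinguish genuine repeats.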


On 9/24/12 9:44 AM, "Raffael Marty" <[hidden email]> wrote:

>#3 is completely different from the other semantics. That's a sessionID,
>which should not be in this field. Different field, different discussion.
>I think the idea here is a unique ID. I don't like #1. You can just use
>hash(data) as a UUID. Why require that? I think in general, I am against
>requiring it. I think we need it in general though.
>
>  -raffy
>
>--
>  Raffael Marty
>  ceo @ pixlcloud
>http://pixlcloud.com
>  @raffaelmarty                                          http://raffy.ch
>
>
>
>On Sep 24, 2012, at 9:39 AM, "Wunder, John A." <[hidden email]> wrote:
>
>> Definitely an important question. But, it's almost important to not
>>overthink it.
>>
>> Some of the options I can think of are:
>>
>> 1. UUID. Upside is a guarantee of uniqueness, downside might be
>>generation time.
>> 2. Unique string or integer assigned by producer. This could be
>>implemented as an incrementing ID but no reason to require that in the
>>spec. Basically the producer would just assign unique IDs as a best
>>effort and consumers would need to know that there could be conflicts
>>from different producers.
>> 3. #2 plus add a producer (or "session") ID. That ID could be a UUID
>>since it wouldn't need to be generated as often, but would give the
>>producer some namespace in which to produce unique IDs.
>>
>> Another thing to consider is whether the recordId would be required or
>>optional.
>>
>> John
>> ________________________________________
>> From: Adam Montville [[hidden email]]
>> Sent: Monday, September 24, 2012 12:18 PM
>> To: cee-discussion-list CEE-Related Discussion
>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe
>>event data
>>
>> On 9/24/12 8:51 AM, "Raffael Marty" <[hidden email]> wrote:
>>
>>> What are the exact semantics of the field? An increasing ID? A UUID?
>>
>>
>> This is a good question.  How do you envision this working between use
>> cases?  We have n collectors receiving events.  For each collector, they
>> are either receiving CEE-formatted events or not.  In the case where
>>they
>> are receiving CEE-formatted events, then what should we expect to see in
>> such a field?  On the other hand, if events are normalized into CEE,
>>then
>> what would we expect to see?
>>
>> I would argue that the latter need not be described in too much detail,
>> but that the former will need some operational support beyond the
>> specification, unless we attempt a globally unique identification
>>scheme,
>> such as using a URI of some sort, which may become burdensome quickly.
>>
>> Caveat with what I've said is that I'm dive bombing into the middle of a
>> discussion - fully aware that there may be information I'm lacking at
>>this
>> point.  Mitch is probably closer to the discussion than I.
>>
>> Regards,
>>
>> Adam
>>
>>
>>
>>
>>
>>
>>>
>>> --
>>> Raffael Marty
>>> ceo @ pixlcloud
>>> http://pixlcloud.com
>>> @raffaelmarty                                          http://raffy.ch
>>>
>>>
>>>
>>> On Sep 24, 2012, at 8:45 AM, Mitch Thomas <[hidden email]> wrote:
>>>
>>>> Please consider adding it (back).
>>>>
>>>>
>>>> On 9/23/12 12:53 PM, "Wunder, John A." <[hidden email]> wrote:
>>>>
>>>>> Should we consider adding it back? Or if I'm mistaken and CEE has
>>>>>never
>>>>> included it, should we add it?
>>>
>>>
>

Re: Similar effort to formally describe event data

Fletcher, Boyd C IV Mr CIV OSD
In reply to this post by Wunder, John A.

John,

our concerns about 1.0 are:

1) lack of support for language translations for human readable representations
2) Microsoft Event data does not map cleanly into v1.0
3) support for device class profiles
4) lack of support for digitally signed records

I think it would provide more value to the community to release a robust specification instead of an incomplete one.

boyd

From: Wunder, John A. [[hidden email]]
Sent: Sunday, September 23, 2012 3:53 PM
To: [hidden email]
Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe event data

So I never really got a lot of response to this.

After our last CEE Board call (almost two weeks ago, I've been late sending this) there was general agreement that considering such a big list of changes so close to the 1.0 release is probably a bad idea. So the board's recommendation was to hold off on these changes until a post-1.0 release and then re-evaluate them one at a time as we find the time and the need.

Do people here agree with that? Does anyone here want to argue for considering one of these changes for the current version?

A related question is why doesn't CEE have a unique event identifier (#3 in the list of changes below) as a required field? I remember seeing it in an older version but it seems to have been removed. Does anyone know the reason for removing it? Should we consider adding it back? Or if I'm mistaken and CEE has never included it, should we add it?

John

>-----Original Message-----
>From: Wunder, John A. [[hidden email]]
>Sent: Wednesday, September 05, 2012 9:14 AM
>To: cee-discussion-list CEE-Related Discussion
>Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe event
>data
>
>Has anyone had a chance to look at the files that Phil sent out? If you don't
>want to go digging through them, here's an example of an event from their
>proposal:
>
>{
>      "eventReference":{
>         "profileReference":{
>            "uuid":"00112233-1234-5678-9abc-ed0123456789",
>            "version":1
>         },
>         "eventId":1,
>         "version":1
>      },
>      "criticality":"DEBUG",
>      "dateTime":"2001-12-31T12:00:00",
>      "recordId":"1234",
>      "fields":[
>         {
>            "name":"username",
>            "value":"jsmith"
>         },
>         {
>            "name":"hostname",
>            "value":"foo.dod.mil"
>         }
>      ]
> }
>
>As you can see, there are a few differences from CEE JSON:
>
>1. Profiles are referenced using a UUID and version rather than a URI
>2. Profiles include a list of "event classes", which define specific fields that are
>required for specific event types. In contrast, CEE profiles (as currently
>implemented) tend to just provide a list of fields you CAN or MUST use for
>that profile without providing a list of event classes.
>3. Events have a record ID (we've talked about this before but as currently
>implemented we don't require that)
>4. Event fields are contained in a "fields" array with "name"/"value" key/value
>pairs rather than standard JSON key/value pairs.
>
>Thoughts?
>
>Personally I'm undecided on 1, prefer 2 to CEE, prefer 3 to CEE, and prefer CEE
>to 4. Phil, can you give some explanation behind 1 and 4? Specifically:
>1. Why use a UUID rather than a URI? The advantages of a URI are that you
>can make it a resolveable URL for ease of use and it provides more human-
>readable profile references (i.e. in CEE I can tell what a profile is by looking at
>the URI, a UUID is transparent unless you know where to go to find it).
>4. Why use the "fields" structure? I realize we've talked about this before, I
>just want to get the explanation on the list for everyone else to see.
>
>John
>
>-----Original Message-----
>From: Wunder, John A. [[hidden email]]
>Sent: Thursday, August 30, 2012 3:36 PM
>To: cee-discussion-list CEE-Related Discussion
>Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event
>data
>
>I've been working with Phil and the JALoP team to help them understand CEE
>and where we're coming from as well as to understand more about where
>they're coming from. So at this point I'm thinking I have a pretty good
>understanding of the differences.
>
>In my opinion the most important distinction is that we have different focus
>areas. JALoP is focusing on syntax and (outside of the audit schema) the
>transport. CEE has the syntax component but also has more of a focus on
>defining the event taxonomy and core field dictionary. So to that extent if we
>move towards the JALoP schemas our existing work on the taxonomy and
>field dictionary wouldn't be significantly affected. The caveat is that we'll have
>to go back and forth a little bit on how the taxonomy ends up being
>represented, since although their concept is kind of a taxonomy it's
>implemented very differently from ours.
>
>In the syntax specifically, the main philosophical difference is that CEE's
>primary focus has been JSON and lightweight uses while JALoP's has been XML
>and more "heavy" uses (that require validation, integrity, etc). That ends up
>being reflected in some of the structures that we use. The proposal Phil has
>here includes both JSON and XML, just like CEE does, but because it started as
>XML the JSON is influenced a little by that. Take a look at what he sent (the
>.piz file is a .zip file w/ the schemas) and compare it to what we have now.
>
>I think it's unlikely that we'd end up doing a full replacement of the CEE syntax
>with the JALoP audit syntax, although of course that's definitely an option. But
>in any case we should look at what's out there and take the best pieces of
>each. If it would help I could set up a telecom where Phil and the team can
>explain what they've done and you all can ask questions.
>
>I'd especially want to get the opinion of event record PRODUCERS here, since
>they're the ones that will be using the syntax.
>
>John
>
>-----Original Message-----
>From: Philip Black-Knight [[hidden email]]
>Sent: Thursday, August 30, 2012 1:17 PM
>To: cee-discussion-list CEE-Related Discussion
>Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event
>data
>
>I'm working on a government funded project (JALoP). The primary goals are to
>author & implement a protocol to reliably transfer what we refer to as journal,
>audit, and log data. Much of our efforts are related to what CEE labels the CLT.
>
>However, we've also done some work (i.e. the schemas I just sent out)
>related to describing log data in a strict format. Because our schemas are
>similar (and have similar goals) we were hoping that it would be possible to
>combine our efforts.
>
>We feel that our schemas are a lot simpler and easier to understand than the
>current CEE approach, and would like to see the CEE format transformed into
>something a little simpler and easier to follow.
>
>-Phil
>
>-----Original Message-----
>From: Raffael Marty [[hidden email]]
>Sent: Thursday, August 30, 2012 12:12 PM
>To: Philip Black-Knight
>Cc: [hidden email]
>Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event
>data
>
>Hi Phil,
>
>When you say "we have been ...", who is we?
>
>The goals you list are pretty much the ones we are addressing with CEE. Glad
>we are all on the same page. Would love to hear your specific input on CEE.
>
>Thanks
>
>  -raffy
>
>--
>  Raffael Marty
>  @raffaelmarty                                          http://raffy.ch
>
>On Aug 30, 2012, at 5:15 AM, Philip Black-Knight <[hidden email]>
>wrote:
>
>> We've been working on a similar effort for describing and enumerating
>events. Our architecture is comprised of 3 distinct documents:
>>
>> -       EventProfileList (profiles.xsd) - An EventProfileList is an enumeration of
>events for a class of device (e.g. router, Cisco router, UNIX system, Solaris).
>The EventProfileList can be for a generic device (e.g. router, switch, UNIX
>system) or specific to a vendor/product (e.g. Cisco Router, Juniper switch,
>Oracle Solaris). This is similar to how SNMP works. The type for each field is
>recorded in the EventProfile, and would be one of the existing CEE dictionary
>types.
>>
>> -       Events (events.xsd) - An Events document is a listing of one or more
>event instances (called records). Each record references an Event from a
>specific EventProfile.
>>
>> -       Translations (translations.xsd) - A Translation document provides a
>human readable, language specific, version of the events enumerated in an
>EventProfileList. To display the details of an Event, you would use a record
>from an Events document, combined with the translation from a Translations
>document, to render a human readable string in the user's language of choice.
>>
>> For example, an EventProfileList could include an event that indicates a user
>logged on to a particular system. In the EventProfileList, the descriptor for this
>login event would indicate the required fields: user_name, hostname, and tty.
>When a login event occurs, the system would record the event in an Events
>document, including the values for these fields. A Translations document (for
>US English) might include the format string: "%user_name% logged on to
>%hostname% on %tty% at %timestamp%". Using this format string, and the
>event record from the Events document, a human readable string can be
>rendered for the user.
>>
>> One of the primary goals is to move away from loosely formatted event
>messages which are hard to parse, and can change from one version of
>software package to the next. Using a common format to enumerate event
>records makes processing and correlating data simpler. This is all part of a
>larger effort to define data formats and protocols for reliably transferring
>Journal/Audit/Logging events. Additionally, we'd like to be able to align
>various software products that perform similar tasks (i.e. routers, or network
>switches, etc) to all use a common set of events so that a router built by Cisco
>& a router built by Juniper Networks would both generate the same events.
>Each vendor would be free to augment their list of events (by providing their
>own EventProfileLists) and would be able to define new events, and/or
>extend existing events.
>>
>> Existing syslog data is easily recorded using this approach. The EventProfile
>for syslog data indicates the required fields for msg (the actual syslog
>message), facility, priority, hostname, applicationName, etc that are part of
>standard syslog messages. The attached zip includes an EventProfileList &
>sample Events document for syslog.
>>
>>
>> Another important goal was to support existing logging systems. While
>creating these schemas, we studied Syslog, Windows Event Structure, Solaris
>Audit, and Linux Audit and believe that all can be represented in our Event
>structures with no loss of information or context.
>>
>> Other features:
>>
>> -       Support optional digital signatures of all XML files to meet government
>requirements (Sarbanes-Oxley, Dodd-Frank, HIPAA, etc.) for audit log
>integrity and non-repudiation
>>
>> -       Updates to language translations (both new languages and to fix errors
>in translations) can be added without the need to update a single line of code.
>>
>> -       Both the EventProfileLists and the Event descriptors within the
>EventProfile list are versioned, allowing event records & translations to target
>a specific version of an event.
>>
>> -       Documents containing Event records may be in XML or JSON format
>>
>> -       Limited support for Event inheritance. Each Event descriptor can inherit
>from some other Event descriptor (multiple inheritance is not supported). For
>example, a generic profile could exist that describes a "logon" event, and
>specifies the required fields username, hostname, and time. The
>EventProfileList for Windows would identify a login event that inherits from
>this base, and adds the fields first_name, last_name, guid, etc. The
>EventProfileList for Linux would add fields for uid, tty, etc. When a Linux Logon
>event is recorded, in addition to the uid and tty, it must also specify the fields
>from the inherited class (i.e. username, hostname, and time). This is similar to
>current CEE approach to taxonomies.
>>
>>
>> To further support this effort, we've been working on a network
>> protocol to reliably and securely transfer Event records off box. These
>protocols are designed to support raw XML as well as compressed formats
>such as Efficient XML Interchange. Attached are a draft set of schemas and
>sample documents (it is just a renamed zip file).
>>
>> -Phil
>>
>> <eventSchemas.1814d42.piz>

Re: Similar effort to formally describe event data

zrlram
> our concerns about 1.0 are:
>  
> 1) lack of support for language translations for human readable representations

The goal of CEE from the beginning was to have a machine-processable standard. May I ask, how would you use this? What does human readable mean? Like actual English sentences?

> 2) Microsoft Event data does not map cleanly into v1.0

That's a big concern. Could you elaborate on that?

> 3) support for device class profiles

Not sure I understand what you mean by this? device classes?

> 4) lack of support for digitally signed records

This is a property that would be part of the transport standard. The nice thing about how CEE is structured is that everything is very modular. It would be fairly simple to add signatures and hash chaining as modules to the transport at some point. However, I don't see signatures on records as a huge demand in the market.
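As a rough illustration of what a hash-chaining module could look like, here is a minimal sketch. It assumes SHA-256 and an arbitrary seed value; the function name and the record contents are hypothetical, and nothing here is part of the CEE transport specification.

```python
import hashlib

def chain_digests(records, seed=b"cee-demo-seed"):
    """Return one digest per record, each covering the previous digest."""
    previous = hashlib.sha256(seed).digest()
    digests = []
    for record in records:
        # Each link depends on everything before it, so editing an earlier
        # record changes every digest that follows.
        previous = hashlib.sha256(previous + record).digest()
        digests.append(previous.hex())
    return digests

original = chain_digests([b"event-1", b"event-2", b"event-3"])
tampered = chain_digests([b"event-1", b"event-X", b"event-3"])
print(original[0] == tampered[0], original[2] == tampered[2])
```

Because the chain is computed over the serialized records rather than inside them, it can sit in the transport layer without changing the event syntax, which is the modularity being described.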

Right now, with CEE, we are trying to address the biggest problems: syntax, dictionary, and a taxonomy. Those three things will get us 80% of the way to machine interoperability...

Thanks

  -raffy

>  
>  
> I think it would provide more value to the community to release a robust specification instead of an incomplete one.
>  
> boyd
>  
>  
>  
>  
> From: Wunder, John A. [[hidden email]]
> Sent: Sunday, September 23, 2012 3:53 PM
> To: [hidden email]
> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe event data
>
> So I never really got a lot of response to this.
>
> After our last CEE Board call (almost two weeks ago, I've been late sending this) there was general agreement that considering such a big list of changes so close to the 1.0 release is probably a bad idea. So the board's recommendation was to hold off on these changes until a post-1.0 release and then re-evaluate them one at a time as we find the time and the need.
>
> Do people here agree with that? Does anyone here want to argue for considering one of these changes for the current version?
>
> A related question is why doesn't CEE have a unique event identifier (#3 in the list of changes below) as a required field? I remember seeing it in an older version but it seems to have been removed. Does anyone know the reason for removing it? Should we consider adding it back? Or if I'm mistaken and CEE has never included it, should we add it?
>
> John
>
> >-----Original Message-----
> >From: Wunder, John A. [mailto:[hidden email]]
> >Sent: Wednesday, September 05, 2012 9:14 AM
> >To: cee-discussion-list CEE-Related Discussion
> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe event
> >data
> >
> >Has anyone had a chance to look at the files that Phil sent out? If you don't
> >want to go digging through them, here's an example of an event from their
> >proposal:
> >
> >{
> >      "eventReference":{
> >         "profileReference":{
> >            "uuid":"00112233-1234-5678-9abc-ed0123456789",
> >            "version":1
> >         },
> >         "eventId":1,
> >         "version":1
> >      },
> >      "criticality":"DEBUG",
> >      "dateTime":"2001-12-31T12:00:00",
> >      "recordId":"1234",
> >      "fields":[
> >         {
> >            "name":"username",
> >            "value":"jsmith"
> >         },
> >         {
> >            "name":"hostname",
> >            "value":"foo.dod.mil"
> >         }
> >      ]
> > }
> >
> >As you can see, there are a few differences from CEE JSON:
> >
> >1. Profiles are referenced using a UUID and version rather than a URI
> >2. Profiles include a list of "event classes", which define specific fields that are
> >required for specific event types. In contrast, CEE profiles (as currently
> >implemented) tend to just provide a list of fields you CAN or MUST use for
> >that profile without providing a list of event classes.
> >3. Events have a record ID (we've talked about this before but as currently
> >implemented we don't require that)
> >4. Event fields are contained in a "fields" array with "name"/"value" key/value
> >pairs rather than standard JSON key/value pairs.
> >
> >Thoughts?
> >
> >Personally I'm undecided on 1, prefer 2 to CEE, prefer 3 to CEE, and prefer CEE
> >to 4. Phil, can you give some explanation behind 1 and 4? Specifically:
> >1. Why use a UUID rather than a URI? The advantages of a URI are that you
> >can make it a resolveable URL for ease of use and it provides more human-
> >readable profile references (i.e. in CEE I can tell what a profile is by looking at
> >the URI, a UUID is transparent unless you know where to go to find it).
> >4. Why use the "fields" structure? I realize we've talked about this before, I
> >just want to get the explanation on the list for everyone else to see.
> >
> >John
> >
> >-----Original Message-----
> >From: Wunder, John A. [mailto:[hidden email]]
> >Sent: Thursday, August 30, 2012 3:36 PM
> >To: cee-discussion-list CEE-Related Discussion
> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event
> >data
> >
> >I've been working with Phil and the JALoP team to help them understand CEE
> >and where we're coming from as well as to understand more about where
> >they're coming from. So at this point I'm thinking I have a pretty good
> >understanding of the differences.
> >
> >In my opinion the most important distinction is that we have different focus
> >areas. JALoP is focusing on syntax and (outside of the audit schema) the
> >transport. CEE has the syntax component but also has more of a focus on
> >defining the event taxonomy and core field dictionary. So to that extent if we
> >move towards the JALoP schemas our existing work on the taxonomy and
> >field dictionary wouldn't be significantly affected. The caveat is that we'll have
> >to go back and forth a little bit on how the taxonomy ends up being
> >represented, since although their concept is kind of a taxonomy it's
> >implemented very differently from ours.
> >
> >In the syntax specifically, the main philosophical difference is that CEE's
> >primary focus has been JSON and lightweight uses while JALoP's has been XML
> >and more "heavy" uses (that require validation, integrity, etc). That ends up
> >being reflected in some of the structures that we use. The proposal Phil has
> >here includes both JSON and XML, just like CEE does, but because it started as
> >XML the JSON is influenced a little by that. Take a look at what he sent (the
> >.piz file is a .zip file w/ the schemas) and compare it to what we have now.
> >
> >I think it's unlikely that we'd end up doing a full replacement of the CEE syntax
> >with the JALoP audit syntax, although of course that's definitely an option. But
> >in any case we should look at what's out there and take the best pieces of
> >each. If it would help I could set up a telecon where Phil and the team can
> >explain what they've done and you all can ask questions.
> >
> >I'd especially want to get the opinion of event record PRODUCERS here, since
> >they're the ones that will be using the syntax.
> >
> >John
> >
> >-----Original Message-----
> >From: Philip Black-Knight [mailto:[hidden email]]
> >Sent: Thursday, August 30, 2012 1:17 PM
> >To: cee-discussion-list CEE-Related Discussion
> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event
> >data
> >
> >I'm working on a government funded project (JALoP). The primary goals are to
> >author & implement a protocol to reliably transfer what we refer to as journal,
> >audit, and log data. Much of our efforts are related to what CEE labels the CLT.
> >
> >However, we've also done some work (i.e. the schemas I just sent out)
> >related to describing log data in a strict format. Because our schemas are
> >similar (and have similar goals) we were hoping that it would be possible to
> >combine our efforts.
> >
> >We feel that our schemas are a lot simpler and easier to understand than the
> >current CEE approach, and would like to see the CEE format transformed into
> >something a little simpler and easier to follow.
> >
> >-Phil
> >
> >-----Original Message-----
> >From: Raffael Marty [mailto:[hidden email]]
> >Sent: Thursday, August 30, 2012 12:12 PM
> >To: Philip Black-Knight
> >Cc: [hidden email]
> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event
> >data
> >
> >Hi Phil,
> >
> >When you say "we have been ...", who is we?
> >
> >The goals you list are pretty much the ones we are addressing with CEE. Glad
> >we are all on the same page. Would love to hear your specific input on CEE.
> >
> >Thanks
> >
> >  -raffy
> >
> >--
> >  Raffael Marty
> >  @raffaelmarty                                          http://raffy.ch
> >
> >On Aug 30, 2012, at 5:15 AM, Philip Black-Knight <[hidden email]>
> >wrote:
> >
> >> We've been working on a similar effort for describing and enumerating
> >events. Our architecture is comprised of 3 distinct documents:
> >>
> >> -       EventProfileList (profiles.xsd) - An EventProfileList is an enumeration of
> >events for a class of device (e.g. router, Cisco router, UNIX system, Solaris).
> >The EventProfileList can be for a generic device (e.g. router, switch, UNIX
> >system) or specific to a vendor/product (e.g. Cisco Router, Juniper switch,
> >Oracle Solaris). This is similar to how SNMP works. The type for each field is
> >recorded in the EventProfile, and would be one of the existing CEE dictionary
> >types.
> >>
> >> -       Events (events.xsd) - An Events document is a listing of one or more
> >event instances (called records). Each record references an Event from a
> >specific EventProfile.
> >>
> >> -       Translations (translations.xsd) - A Translation document provides a
> >human readable, language specific, version of the events enumerated in an
> >EventProfileList. To display the details of an Event, you would use a record
> >from an Events document, combined with the translation from a Translations
> >document, to render a human readable string in the user's language of choice.
> >>
> >> For example, an EventProfileList could include an event that indicates a user
> >logged on to a particular system. In the EventProfileList, the descriptor for this
> >login event would indicate the required fields: user_name, hostname, and tty.
> >When a login event occurs, the system would record the event in an Events
> >document, including the values for these fields. A Translations document (for
> >US English) might include the format string: "%user_name% logged on to
> >%hostname% on %tty% at %timestamp%". Using this format string, and the
> >event record from the Events document, a human readable string can be
> >rendered for the user.
> >>
> >> One of the primary goals is to move away from loosely formatted event
> >messages which are hard to parse, and can change from one version of
> >software package to the next. Using a common format to enumerate event
> >records makes processing and correlating data simpler. This is all part of a
> >larger effort to define data formats and protocols for reliably transferring
> >Journal/Audit/Logging events. Additionally, we'd like to be able to align
> >various software products that perform similar tasks (i.e. routers, or network
> >switches, etc) to all use a common set of events so that a router built by Cisco
> >& a router built by Juniper Networks would both generate the same events.
> >Each vendor would be free to augment their list of events (by providing their
> >own EventProfileLists) and would be able to define new events, and/or
> >extend existing events.
> >>
> >> Existing syslog data is easily recorded using this approach. The EventProfile
> >for syslog data indicates the required fields for msg (the actual syslog
> >message), facility, priority, hostname, applicationName, etc that are part of
> >standard syslog messages. The attached zip includes an EventProfileList &
> >sample Events document for syslog.
> >>
> >>
> >> Another important goal was to support existing logging systems. While
> >creating these schemas, we studied Syslog, Windows Event Structure, Solaris
> >Audit, and Linux Audit and believe that all can be represented in our Event
> >structures with no loss of information or context.
> >>
> >> Other features:
> >>
> >> -       Support optional digital signatures of all XML files to meet government
> >requirements (Sarbanes-Oxley, Dodd-Frank, HIPAA, etc.) for audit log
> >integrity and non-repudiation
> >>
> >> -       Updates to language translations (both new languages and to fix errors
> >in translations) can be added without the need to update a single line of code.
> >>
> >> -       Both the EventProfileLists and the Event descriptors within the
> >EventProfile list are versioned, allowing event records & translations to target
> >a specific version of an event.
> >>
> >> -       Documents containing Event records may be in XML or JSON format
> >>
> >> -       Limited support for Event inheritance. Each Event descriptor can inherit
> >from some other Event descriptor (multiple inheritance is not supported). For
> >example, a generic profile could exist that describes a "logon" event, and
> >specifies the required fields username, hostname, and time. The
> >EventProfileList for Windows would identify a login event that inherits from
> >this base, and adds the fields first_name, last_name, guid, etc. The
> >EventProfileList for Linux would add fields for uid, tty, etc. When a Linux Logon
> >event is recorded, in addition to the uid and tty, it must also specify the fields
> >from the inherited class (i.e. username, hostname, and time). This is similar to
> >the current CEE approach to taxonomies.
> >>
> >>
> >> To further support this effort, we've been working on a network
> >> protocol to reliably and securely transfer Event records off box. These
> >protocols are designed to support raw XML as well as compressed formats
> >such as Efficient XML Interchange. Attached are a draft set of schemas and
> >sample documents (it is just a renamed zip file).
> >>
> >> -Phil
> >>
> >> <eventSchemas.1814d42.piz>

Re: Similar effort to formally describe event data

william.leroy
In reply to this post by Wunder, John A.
Dear All,
I believe the UUID concept is needed.
Does this include a unique event ID from each source or event
management server?


On Sunday, September 23, 2012, Wunder, John A. wrote:
So I never really got a lot of response to this.

After our last CEE Board call (almost two weeks ago, I've been late sending this) there was general agreement that considering such a big list of changes so close to the 1.0 release is probably a bad idea. So the board's recommendation was to hold off on these changes until a post-1.0 release and then re-evaluate them one at a time as we find the time and the need.

Do people here agree with that? Does anyone here want to argue for considering one of these changes for the current version?

A related question is why doesn't CEE have a unique event identifier (#3 in the list of changes below) as a required field? I remember seeing it in an older version but it seems to have been removed. Does anyone know the reason for removing it? Should we consider adding it back? Or if I'm mistaken and CEE has never included it, should we add it?

John

>-----Original Message-----
>From: Wunder, John A. [mailto:jwunder@...]
>Sent: Wednesday, September 05, 2012 9:14 AM
>To: cee-discussion-list CEE-Related Discussion
>Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe event
>data
>
>Has anyone had a chance to look at the files that Phil sent out? If you don't
>want to go digging through them, here's an example of an event from their
>proposal:
>
>{
>      "eventReference":{
>         "profileReference":{
>            "uuid":"00112233-1234-5678-9abc-ed0123456789",
>            "version":1
>         },
>         "eventId":1,
>         "version":1
>      },
>      "criticality":"DEBUG",
>      "dateTime":"2001-12-31T12:00:00",
>      "recordId":"1234",
>      "fields":[
>         {
>            "value":"jsmith",
>            "name":"username"
>         },
>         {
>            "name":"hostname",
>            "value":"foo.dod.mil"
>         }
>      ]
> }
>
>As you can see, there are a few differences from CEE JSON:
>
>1. Profiles are referenced using a UUID and version rather than a URI
>2. Profiles include a list of "event classes", which define specific fields that are
>required for specific event types. In contrast, CEE profiles (as currently
>implemented) tend to just provide a list of fields you CAN or MUST use for
>that profile without providing a list of event classes.
>3. Events have a record ID (we've talked about this before but as currently
>implemented we don't require that)
>4. Event fields are contained in a "fields" array with "name"/"value" key/value
>pairs rather than standard JSON key/value pairs.
>
>Thoughts?
>
>Personally I'm undecided on 1, prefer 2 to CEE, prefer 3 to CEE, and prefer CEE
>to 4. Phil, can you give some explanation behind 1 and 4? Specifically:
>1. Why use a UUID rather than a URI? The advantages of a URI are that you
>can make it a resolvable URL for ease of use and it provides more human-
>readable profile references (i.e. in CEE I can tell what a profile is by looking at
>the URI, whereas a UUID is opaque unless you know where to go to find it).
>4. Why use the "fields" structure? I realize we've talked about this before, I
>just want to get the explanation on the list for everyone else to see.
>
>John
>
>-----Original Message-----
>From: Wunder, John A. [mailto:[hidden email]]
>Sent: Thursday, August 30, 2012 3:36 PM
>To: cee-discussion-list CEE-Related Discussion
>Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally discribe event
>data
>
>I've been working with Phil and the JALoP team to help them understand CEE
>and where we're coming from as well as to understand more about where
>they're coming from. So at this point I'm thinking I have a pretty good
>understanding of the differences.
>
>In my opinion the most important distinction is that we have different focus
>areas. JALoP is focusing on syntax and (outside of the audit schema) the
>transport. CEE has the syntax component but also has more of a focus on
>defining the event taxonomy and core field dictionary. So to that extent if we
>move towards the JALoP schemas our existing work on the taxonomy and
>field dictionary wouldn't be significantly affected. The caveat is that we'll have
>to go back and forth a little bit on how the taxonomy ends up being
>represented, since although their concept is kind of a taxonomy it's
>implemented very differently from ours.
>
>In the syntax specifically, the main philosophical difference is that CEE's
>primary focus has been JSON and lightweight uses while JALoP's has been XML
>and more "heavy" uses (that require validation, integrity, etc). That ends up
>being reflected in some of the structures that we use.


--
Bill LeRoy


[hidden email]

Re: Similar effort to formally describe event data

Fletcher, Boyd C IV Mr CIV OSD
In reply to this post by zrlram
Comments below.



On 9/24/12 6:05 PM, "Raffael Marty" <[hidden email]> wrote:

>> our concerns about 1.0 are:
>>  
>> 1) lack of support for language translations for human readable
>>representations
>
>The goal of CEE from the beginning was to have a machine-processable
>standard. May I ask, how would you use this? What does human readable
>mean? Like actual English sentences?

Yes, English, German, Japanese, etc. sentences that are mapped via an
event ID and profile ID to the raw data being stored and/or sent over the
wire.

The machine-processable form must be tied to a human-readable version;
otherwise you lose meaning during analysis.

We do a huge amount of automated and semi-automated log analysis. Having
both the raw data and the human readable version is critical to effective
understanding of the data.


>
>> 2) Microsoft Event data does not map cleanly into v1.0
>
>That's a big concern. Could you elaborate on that?

Microsoft Event data is dynamically extensible and supports binary data.
It also has the concept of profiles.

>
>> 3) support for device class profiles
>
>Not sure I understand what you mean by this? device classes?


Like SNMP profiles for routers, switches, operating systems, applications,
etc.

For example, Cisco routers support both the basic IETF Router MIB and the
Cisco-specific MIB.

We think audit events should work the same way.

For example, for RHEL there would be a base O/S profile, which would be
extended by a POSIX Unix profile and then by the Linux profile:

Base O/S Profile: USER logged in at TIME
Unix Profile: USER logged in at TIME on HOSTNAME via TTY
Linux Profile: USER logged in at TIME on HOSTNAME via TTY with
SELINUX_CONTEXT

USER, TIME, HOSTNAME, TTY, and SELINUX_CONTEXT would be the data sent over
the wire, and it would be tied to an event ID which is located in the Profile.
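
The profile chain above can be sketched in a few lines of Python. This is
only an illustration: the profile IDs ("base-os", "unix", "linux") and the
dictionary layout are made up for this example and are not taken from the
JALoP schemas or from CEE. It shows how the required fields for an event
accumulate down a single-inheritance chain, and how a record could be
checked against them.

```python
# Hypothetical profile chain; IDs and layout are illustrative assumptions.
PROFILES = {
    "base-os": {"parent": None, "fields": ["USER", "TIME"]},
    "unix": {"parent": "base-os", "fields": ["HOSTNAME", "TTY"]},
    "linux": {"parent": "unix", "fields": ["SELINUX_CONTEXT"]},
}


def required_fields(profile_id):
    """Collect required fields from the root profile down to profile_id."""
    chain = []
    while profile_id is not None:
        profile = PROFILES[profile_id]
        chain.append(profile["fields"])
        profile_id = profile["parent"]
    # Reverse so that inherited (base) fields come first.
    return [name for fields in reversed(chain) for name in fields]


def missing_fields(profile_id, record):
    """Return the required fields that the record does not supply."""
    return [name for name in required_fields(profile_id) if name not in record]


print(required_fields("linux"))
print(missing_fields("linux", {"USER": "boyd", "TIME": "23:29:00 24SEP2012"}))
```

A producer could run a check like missing_fields() before emitting a
record, and a consumer could run the same check on receipt.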

So in the Linux case, you would have an eventRecord list:

RecordID=102848     // some machine-generated unique ID for that
instantiation of the event
EventID=2284  // the Linux Login event ID
ProfileID=CEE-RHEL6   // some type of profile ID (UUID, URI, etc.) that
references a profile (event class) that contains the events for that
system/application

USER=boyd
TIME=23:29:00 24SEP2012
TTY=ttya
SELINUX_CONTEXT=admin_user_t


The translations file would then contain something like this for eventID
2284 in English:

<eventTranslation eventID="2284" lang="en">
%USER% logged in at %TIME% on %HOSTNAME% via %TTY% with %SELINUX_CONTEXT%
</eventTranslation>

A renderer would use variable substitution to generate output like:

"boyd logged in at 23:29:00 24SEP2012 via ttya with admin_user_t"



Analytics software would use the profile, event IDs, and dictionary, while
an audit viewer application would map the values to the human-readable
translation.
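
As a rough sketch of the variable substitution step, here is a small Python
renderer. The render() function and the choice to leave unknown
placeholders untouched are assumptions made for this example; the schemas
Phil sent may define different behaviour for missing fields.

```python
import re

# Matches placeholders of the form %NAME%.
PLACEHOLDER = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)%")


def render(template, fields):
    """Replace each %NAME% with fields[NAME]; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        return str(fields.get(name, match.group(0)))
    return PLACEHOLDER.sub(substitute, template)


template = ("%USER% logged in at %TIME% on %HOSTNAME% "
            "via %TTY% with %SELINUX_CONTEXT%")
record = {
    "USER": "boyd",
    "TIME": "23:29:00 24SEP2012",
    "TTY": "ttya",
    "SELINUX_CONTEXT": "admin_user_t",
}
print(render(template, record))
```

Because the sample record carries no HOSTNAME value, this sketch leaves
%HOSTNAME% visible in the output rather than silently dropping it, which
makes a missing required field easy to spot in a viewer.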

The XML files that Phil sent have some good examples of how this would
work.



>
>> 4) lack of support for digitally signed records
>
>This is a property that would be part of the transport standard. The nice
>thing about how CEE is structured is that everything is very modular. It
>would be fairly simple to add signatures and hash chaining as modules to
>the transport at some point. However, I don't see signatures on records
>as a huge demand in the market.

Transport only will not work for the legal requirements (HIPAA,
Sarbanes-Oxley, etc.) that the O/S and application vendors are being
asked to support. Transport security (e.g. TLS) only guarantees that the
transmission wasn't corrupted, not when, where, by whom, or how the audit
record was generated.

The O/S and application vendors are under a lot of legal (gov. regs) and
financial community pressure to support signed audit records.
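
To illustrate protecting the record itself rather than the channel, here is
a minimal Python sketch. Note the substitution: it uses an HMAC-SHA256 tag
from the standard library as a stand-in for a real digital signature. An
HMAC relies on a shared secret, so it gives integrity but not the
non-repudiation the regulations call for; a real deployment would use an
asymmetric signature scheme instead. The function names and the canonical
JSON serialization are assumptions made for this example.

```python
import hashlib
import hmac
import json


def canonical_bytes(record):
    # Sorted keys and fixed separators so the same record always
    # serializes to the same bytes.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record, key):
    """Return a hex HMAC-SHA256 tag over the canonical form of the record."""
    return hmac.new(key, canonical_bytes(record), hashlib.sha256).hexdigest()


def verify_record(record, tag, key):
    """Check the tag in constant time."""
    return hmac.compare_digest(sign_record(record, key), tag)


key = b"demo-key-not-for-production"
record = {"RecordID": 102848, "EventID": 2284, "USER": "boyd"}
tag = sign_record(record, key)
print(verify_record(record, tag, key))

tampered = dict(record, USER="mallory")
print(verify_record(tampered, tag, key))
```

The tag travels with the record, so it can still be checked after the
record has been stored or relayed, independent of how it was transported.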

>
>Right now, with CEE, we are trying to address the biggest problems:
>syntax, dictionary, and a taxonomy. Those three things will get us 80%
>the way to machine interoperability...


We don't believe the current syntax (especially for XML) is sufficient to
meet the demands of modern operating systems and applications. Most of the
CEE structure that exists today appears to focus on Unix/Linux Audit,
routers, and switches. After discussing this with John Wunder, he agreed
the current approach was focused on those use cases.

Did you look at the XML files that Phil sent?



>
>Thanks
>
>  -raffy
>
>>  
>>  
>> I think it would provide more value to the community to release a
>>robust specification instead of an incomplete one.
>>  
>> boyd
>>  
>>  
>>  
>>  
>> From: Wunder, John A. [[hidden email]]
>> Sent: Sunday, September 23, 2012 3:53 PM
>> To: [hidden email]
>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe
>>event data
>>
>> So I never really got a lot of response to this.
>>
>> After our last CEE Board call (almost two weeks ago, I've been late
>>sending this) there was generally agreement that considering such a big
>>list of changes so close to 1.0 release is probably a bad idea. So the
>>board's recommendation was to hold off on these changes until a post-1.0
>>release and then re-evaluate them one at a time as we find the time and
>>the need.
>>
>> Do people here agree with that? Does anyone here want to argue for
>>considering one of these changes for the current version?
>>
>> A related question is why doesn't CEE have a unique event identifier
>>(#3 in the list of changes below) as a required field? I remember seeing
>>it in and older version but it seems to have been removed. Does anyone
>>know the reason for removing it? Should we consider adding it back? Or
>>if I'm mistaken and CEE has never included it, should we add it?
>>
>> John

Re: Similar effort to formally describe event data

Rainer Gerhards
In reply to this post by Wunder, John A.
> -----Original Message-----
> From: Wunder, John A. [mailto:[hidden email]]
> Sent: Monday, September 24, 2012 7:03 PM
> To: [hidden email]
> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe
> event data
>
> Well my point with #3 is that if you have a sessionId you don't need to
> worry about a globally unique recordId. You just create a session-
> unique recordID and combine it with the sessionId to give you a
> globally unique ID.

But then you must make sure that the session ID is unique to some tuple. Which is it?
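
To make the question concrete, here is a minimal sketch under one possible answer: the session ID is a random UUID generated once per producer start, so its uniqueness does not depend on any (host, process, time) tuple. The `Producer` class and the `session:counter` format are hypothetical and not taken from the CEE spec.

```python
import itertools
import uuid


class Producer:
    """Hypothetical event producer that namespaces its record IDs."""

    def __init__(self):
        # Assumption: one random UUID per producer start ("session").
        # It is generated once, so its cost is not paid per record.
        self.session_id = str(uuid.uuid4())
        self._counter = itertools.count(1)

    def next_record_id(self):
        # The counter is only unique within this session; prefixing it
        # with the session ID yields a globally unique record ID.
        return f"{self.session_id}:{next(self._counter)}"


producer = Producer()
first = producer.next_record_id()
second = producer.next_record_id()
print(first)
print(second)
```

If the session ID were instead derived from a tuple, the tuple would have to be spelled out in the specification, which is the point of the question.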

Rainer

>
> John
> ________________________________________
> From: Adam Montville [[hidden email]]
> Sent: Monday, September 24, 2012 12:53 PM
> To: cee-discussion-list CEE-Related Discussion
> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe
> event data
>
> On 9/24/12 9:44 AM, "Raffael Marty" <[hidden email]> wrote:
>
> >#3 is completely different from the other semantics. That's a
> sessionID,
> >which should not be in this field. Different field, different
> discussion.
>
> Agree.
>
> >I think the idea here is a unique ID. I don't like #1. You can just
> use
> >hash(data) as a UUID. Why require that? I think in general, I am
> against
> >requiring it. I think we need it in general though.
>
> Which hash would you use and what are the precise uniqueness
> requirements?
>  Global in the context of a single enterprise, or global across all
> enterprises?  How many events would that mean we'd be generating IDs
> for
> over what period of time?  Then, what are the collision rates of the
> chosen hash?  It'll probably all work out, but being precise here might
> be
> prudent.
>
> About setting it as a requirement, what are the interoperability
> implications if we don't require the field to be populated?
>
>
> >
> >  -raffy
> >
> >--
> >  Raffael Marty
> >  ceo @ pixlcloud
> >http://pixlcloud.com
> >  @raffaelmarty
> http://raffy.ch
> >
> >
> >
> >On Sep 24, 2012, at 9:39 AM, "Wunder, John A." <[hidden email]>
> wrote:
> >
> >> Definitely an important question. But, it's also important to not
> >>overthink it.
> >>
> >> Some of the options I can think of are:
> >>
> >> 1. UUID. Upside is a guarantee of uniqueness, downside might be
> >>generation time.
> >> 2. Unique string or integer assigned by producer. This could be
> >>implemented as an incrementing ID but no reason to require that in
> the
> >>spec. Basically the producer would just assign unique IDs as a best
> >>effort and consumers would need to know that there could be conflicts
> >>from different producers.
> >> 3. #2 plus add a producer (or "session") ID. That ID could be a UUID
> >>since it wouldn't need to be generated as often, but would give the
> >>producer some namespace in which to produce unique IDs.
> >>
> >> Another thing to consider is whether the recordId would be required
> or
> >>optional.
> >>
> >> John
> >> ________________________________________
> >> From: Adam Montville [[hidden email]]
> >> Sent: Monday, September 24, 2012 12:18 PM
> >> To: cee-discussion-list CEE-Related Discussion
> >> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
> describe
> >>event data
> >>
> >> On 9/24/12 8:51 AM, "Raffael Marty" <[hidden email]> wrote:
> >>
> >>> What are the exact semantics of the field? An increasing ID? A
> UUID?
> >>
> >>
> >> This is a good question.  How do you envision this working between
> use
> >> cases?  We have n collectors receiving events.  For each collector,
> they
> >> are either receiving CEE-formatted events or not.  In the case where
> >>they
> >> are receiving CEE-formatted events, then what should we expect to
> see in
> >> such a field?  On the other hand, if events are normalized into CEE,
> >>then
> >> what would we expect to see?
> >>
> >> I would argue that the latter need not be described in too much
> detail,
> >> but that the former will need some operational support beyond the
> >> specification, unless we attempt a globally unique identification
> >>scheme,
> >> such as using a URI of some sort, which may become burdensome
> quickly.
> >>
> >> Caveat with what I've said is that I'm dive bombing into the middle
> of a
> >> discussion - fully aware that there may be information I'm lacking
> at
> >>this
> >> point.  Mitch is probably closer to the discussion than I.
> >>
> >> Regards,
> >>
> >> Adam
> >>
> >>
> >>
> >>
> >>
> >>
> >>>
> >>> --
> >>> Raffael Marty
> >>> ceo @ pixlcloud
> >>> http://pixlcloud.com
> >>> @raffaelmarty
> http://raffy.ch
> >>>
> >>>
> >>>
> >>> On Sep 24, 2012, at 8:45 AM, Mitch Thomas <[hidden email]>
> wrote:
> >>>
> >>>> Please consider adding it (back).
> >>>>
> >>>>
> >>>> On 9/23/12 12:53 PM, "Wunder, John A." <[hidden email]> wrote:
> >>>>
> >>>>> Should we consider adding it back? Or if I'm mistaken and CEE has
> >>>>>never
> >>>>> included it, should we add it?
> >>>
> >>>
> >
> >

Re: Similar effort to formally describe event data

Rainer Gerhards
In reply to this post by Fletcher, Boyd C IV Mr CIV OSD
General comment: wouldn't it make more sense to finish a simple approach before sidetracking it with a more complex one? CEE has already taken a lot of time, and now that we see some interest in the market, we start all over with a new, much more complex thing? I see that CEE will not solve all use cases, but IMHO it is far better to solve *a lot* (no percentage given, intentionally) than none (because we wait another 5 years). And if we start over, by the time we have almost finished, someone else will show up. This very same story has been happening for over 15 years...

Detail comments below.

> -----Original Message-----
> From: Fletcher, Boyd C IV Mr CIV OSD [mailto:[hidden email]]
> Sent: Tuesday, September 25, 2012 3:45 AM
> To: [hidden email]
> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe
> event data
>
> Comments below.
>
>
>
> On 9/24/12 6:05 PM, "Raffael Marty" <[hidden email]> wrote:
>
> >> our concerns about 1.0 are:
> >>
> >> 1) lack of support for language translations for human readable
> >>representations
> >
> >The goal of CEE from the beginning was to have a machine-processable
> >standard. May I ask, how would you use this? What does human readable
> >mean? Like actual English sentences?
>
> Yes, english, german, japanese, etc... sentences that are mapped via an
> event ID and profile ID to the raw data being stored and/or sent over
> the
> wire.
>
> Machine processable must be tied to human readable version otherwise
> you
> lose meaning during analysis.
>
> We do a huge amount of automated and semi-automated log analysis.
> Having
> both the raw data and the human readable version is critical to
> effective
> understanding of the data.

If the CEE record is based on a profile, you can always assign human-readable text, as the semantics are known. For non-profile records or extensions, this is currently not possible. Yes, that's right.
>
>
> >
> >> 2) Microsoft Event data does not map cleanly into v1.0
> >
> >That's a big concern. Could you elaborate on that?
>
> Its dynamically extensible and it supports binary data. It also has the
> concept of profiles.

So what?

>
> >
> >> 3) support for device class profiles
> >
> >Not sure I understand what you mean by this? device classes?
>
>
> Like SNMP profiles for routers, switches, operating systems,
> applications,
> etc...
>
> For example, CISCO routers support both the basic IETF Router MIB and
> the
> CISCO specific MIB.
>
> We think audit events should work the same way.
>
> For example for RHEL,
>
> There would be a base O/S profile, that would be extended by  a POSIX
> Unix
> profile, then the Linux profile
>
> Base O/S Profile: USER logged in at TIME
> Unix Profile: USER logged in at TIME on HOSTNAME via TTY
> Linux Profile: USER logged in at TIME on HOSTNAME via TTY with
> SELINUX_CONTEXT
>
> USER, TIME, HOSTNAME, TTY, and SELINUX_CONTEXT would be the data sent over
> the wire, and it would be tied to an event ID which is located in the
> Profile.
>
> So in the Linux case, you would have an eventRecord list:
>
> RecordID=102848     // some machine generated unique ID for that
> instantiation of the event
> EventID=2284  // the Linux Login event ID
> ProfileID=CEE-RHEL6   // some type of profile ID (UUID, URI, etc...) that
> references a profile (event class) that  contains the events for that
> system/application
>
> USER=boyd
> TIME=23:29:00 24SEP2012
> TTY=ttya
> SELINUX_CONTEXT=admin_user_t
>
>
> While the translations file would contain something like for eventID
> 2284
> in the english language:
>
> <eventTranslation eventID="2284" lang="en">
> %USER% logged in at %TIME% on %HOSTNAME% via %TTY% with
> %SELINUX_CONTEXT%
> </eventTranslation>
>
> A renderer would use variable substitution to  generate output like:
>
> "boyd logged in at 23:29:00 24SEP2012 via tty a with admin_user_t"
>
>
>
> Analytics Software would use the Profile, eventIDs, Dictionary, while
> an
> audit viewer application would map the values to the human readable
> translation.
>
Assuming that you write the necessary software, I do not see why you could not do this with CEE.
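
As an illustration only (not part of the CEE spec), the rendering step can be sketched against a flat CEE-style key/value record. The `TRANSLATIONS` table, the `render` function, and the HOSTNAME value are assumptions made for this sketch; the template is adapted from the quoted example.

```python
import re

# Hypothetical translation table keyed by (profile ID, event ID, language).
TRANSLATIONS = {
    ("CEE-RHEL6", 2284, "en"): (
        "%USER% logged in at %TIME% on %HOSTNAME% via %TTY% "
        "with %SELINUX_CONTEXT%"
    ),
}


def render(record, lang="en"):
    """Render a flat key/value record as a human-readable sentence."""
    template = TRANSLATIONS[(record["ProfileID"], record["EventID"], lang)]
    # Substitute each %NAME% with the record's value; a placeholder with
    # no matching field is left untouched so the gap stays visible.
    return re.sub(
        r"%(\w+)%",
        lambda m: str(record.get(m.group(1), m.group(0))),
        template,
    )


record = {
    "RecordID": 102848,
    "EventID": 2284,
    "ProfileID": "CEE-RHEL6",
    "USER": "boyd",
    "TIME": "23:29:00 24SEP2012",
    "HOSTNAME": "foo.dod.mil",  # assumed value, not in the quoted record
    "TTY": "ttya",
    "SELINUX_CONTEXT": "admin_user_t",
}

print(render(record))
# boyd logged in at 23:29:00 24SEP2012 on foo.dod.mil via ttya with admin_user_t
```

The record stays machine-processable as plain key/value pairs; the translation is looked up only when a viewer needs a sentence.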

> The XML files that Phil sent have some good examples of how this would
> work.
>
>
>
> >
> >> 4) lack of support for digitally signed records
> >
> >This is a property that would be part of the transport standard. The
> nice
> >thing about how CEE is structured is that everything is very modular.
> It
> >would be fairly simple to add signatures and hash chaining as modules
> to
> >the transport at some point. However, I don't see signatures on
> records
> >as a huge demand in the market.
>
> Transport only will not work for the legal requirements (HIPAA,
> Sarbanes-Oxley, etc...) that the O/S and applications vendors are being
> asked to support. Transport security (e.g. TLS) only guarantees the
> transmission wasn't corrupted, not when/where/whom/how the audit record
> was
> generated.
>
> The o/s and applications vendors are under a lot of legal (gov. regs)
> and
> financial community pressure to support signed audit records.

Here I agree: transport signature is insufficient. However, signature can be added to the CEE container, and last year it was planned (at least discussed) to do that in a later revision of the spec.
 

> >
> >Right now, with CEE, we are trying to address the biggest problems:
> >syntax, dictionary, and a taxonomy. Those three things will get us 80%
> >the way to machine interoperability...
>
>
> We don't believe the current Syntax (especially for XML) is sufficient
> to
> meet the demands of modern operating systems and applications. Most of
> the
> CEE structure that exists today appears focus on Unix/Linux Audit,
> Routers, and Switches. After discussing this with John Wunder, he
> agreed
> the current approach was focused on those use cases.

This, and Windows events. Eric Fitzgerald was a long-term member of the board before he departed Microsoft. Until last year, he made sure CEE contained everything Microsoft needed. So this point sounds a bit moot to me.

Rainer

>
> Did you look at the XML files that Phil sent?
>
>
>
> >
> >Thanks
> >
> >  -raffy
> >
> >>
> >>
> >> I think it would provide more value to the community to release a
> >>robust specification instead of an incomplete one.
> >>
> >> boyd
> >>
> >>
> >>
> >>
> >> From: Wunder, John A. [[hidden email]]
> >> Sent: Sunday, September 23, 2012 3:53 PM
> >> To: [hidden email]
> >> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
> describe
> >>event data
> >>
> >> So I never really got a lot of response to this.
> >>
> >> After our last CEE Board call (almost two weeks ago, I've been late
> >>sending this) there was generally agreement that considering such a
> big
> >>list of changes so close to 1.0 release is probably a bad idea. So
> the
> >>board's recommendation was to hold off on these changes until a post-
> 1.0
> >>release and then re-evaluate them one at a time as we find the time
> and
> >>the need.
> >>
> >> Do people here agree with that? Does anyone here want to argue for
> >>considering one of these changes for the current version?
> >>
> >> A related question is why doesn't CEE have a unique event identifier
> >>(#3 in the list of changes below) as a required field? I remember
> seeing
> >>it in an older version but it seems to have been removed. Does
> anyone
> >>know the reason for removing it? Should we consider adding it back?
> Or
> >>if I'm mistaken and CEE has never included it, should we add it?
> >>
> >> John
> >>
> >> >-----Original Message-----
> >> >From: Wunder, John A. [mailto:[hidden email]]
> >> >Sent: Wednesday, September 05, 2012 9:14 AM
> >> >To: cee-discussion-list CEE-Related Discussion
> >> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
> describe
> >>event
> >> >data
> >> >
> >> >Has anyone had a chance to look at the files that Phil sent out? If
> >>you don't
> >> >want to go digging through them, here's an example of an event from
> >>their
> >> >proposal:
> >> >
> >> >{
> >> >      "eventReference":{
> >> >         "profileReference":{
> >> >            "uuid":"00112233-1234-5678-9abc-ed0123456789",
> >> >            "version":1
> >> >         },
> >> >         "eventId":1,
> >> >         "version":1
> >> >      },
> >> >      "criticality":"DEBUG",
> >> >      "dateTime":"2001-12-31T12:00:00",
> >> >      "recordId":"1234",
> >> >      "fields":[
> >> >         {
> >> >            "value":"jsmith",
> >> >            "name":"username"
> >> >         },
> >> >         {
> >> >            "name":"hostname",
> >> >            "value":"foo.dod.mil"
> >> >         }
> >> >      ]
> >> > }
> >> >
> >> >As you can see, there are a few differences from CEE JSON:
> >> >
> >> >1. Profiles are referenced using a UUID and version rather than a
> URI
> >> >2. Profiles include a list of "event classes", which define
> specific
> >>fields that are
> >> >required for specific event types. In contrast, CEE profiles (as
> >>currently
> >> >implemented) tend to just provide a list of fields you CAN or MUST
> use
> >>for
> >> >that profile without providing a list of event classes.
> >> >3. Events have a record ID (we've talked about this before but as
> >>currently
> >> >implemented we don't require that)
> >> >4. Event fields are contained in a "fields" array with
> "name"/"value"
> >>key/value
> >> >pairs rather than standard JSON key/value pairs.
> >> >
> >> >Thoughts?
> >> >
> >> >Personally I'm undecided on 1, prefer 2 to CEE, prefer 3 to CEE,
> and
> >>prefer CEE
> >> >to 4. Phil, can you give some explanation behind 1 and 4?
> Specifically:
> >> >1. Why use a UUID rather than a URI? The advantages of a URI are
> that
> >>you
> >> >can make it a resolveable URL for ease of use and it provides more
> >>human-
> >> >readable profile references (i.e. in CEE I can tell what a profile
> is
> >>by looking at
> >> >the URI, a UUID is transparent unless you know where to go to find
> it).
> >> >4. Why use the "fields" structure? I realize we've talked about
> this
> >>before, I
> >> >just want to get the explanation on the list for everyone else to
> see.
> >> >
> >> >John
> >> >
> >> >-----Original Message-----
> >> >From: Wunder, John A. [mailto:[hidden email]]
> >> >Sent: Thursday, August 30, 2012 3:36 PM
> >> >To: cee-discussion-list CEE-Related Discussion
> >> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
> discribe
> >>event
> >> >data
> >> >
> >> >I've been working with Phil and the JALoP team to help them
> understand
> >>CEE
> >> >and where we're coming from as well as to understand more about
> where
> >> >they're coming from. So at this point I'm thinking I have a pretty
> good
> >> >understanding of the differences.
> >> >
> >> >In my opinion the most important distinction is that we have
> different
> >>focus
> >> >areas. JALoP is focusing on syntax and (outside of the audit
> schema)
> >>the
> >> >transport. CEE has the syntax component but also has more of a
> focus on
> >> >defining the event taxonomy and core field dictionary. So to that
> >>extent if we
> >> >move towards the JALoP schemas our existing work on the taxonomy
> and
> >> >field dictionary wouldn't be significantly affected. The caveat is
> >>that we'll have
> >> >to go back and forth a little bit on how the taxonomy ends up being
> >> >represented, since although their concept is kind of a taxonomy
> it's
> >> >implemented very differently from ours.
> >> >
> >> >In the syntax specifically, the main philosophical difference is
> that
> >>CEE's
> >> >primary focus has been JSON and lightweight uses while JALoPs has
> been
> >>XML
> >> >and more "heavy" uses (that require validation, integrity, etc).
> That
> >>ends up
> >> >being reflected in some of the structures that we use. The proposal
> >>Phil has
> >> >here includes both JSON and XML, just like CEE does, but because it
> >>started as
> >> >XML the JSON is influenced a little by that. Take a look at what he
> >>sent (the
> >> >.piz file is a .zip file w/ the schemas) and compare it to what we
> >>have now.
> >> >
> >> >I think it's unlikely that we'd end up doing a full replacement of
> the
> >>CEE syntax
> >> >with the JALoP audit syntax, although of course that's definitely
> an
> >>option. But
> >> >in any case we should look at what's out there and take the best
> >>pieces of
> >> >each. If it would help I could set up a telecom where Phil and the
> >>team can
> >> >explain what they've done and you all can ask questions.
> >> >
> >> >I'd especially want to get the opinion of event record PRODUCERS
> here,
> >>since
> >> >they're the one that will be using the syntax.
> >> >
> >> >John
> >> >
> >> >-----Original Message-----
> >> >From: Philip Black-Knight [mailto:[hidden email]]
> >> >Sent: Thursday, August 30, 2012 1:17 PM
> >> >To: cee-discussion-list CEE-Related Discussion
> >> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
> discribe
> >>event
> >> >data
> >> >
> >> >I'm working on a government funded project (JALoP). The primary
> goals
> >>are to
> >> >author & implement a protocol to reliably transfer what we refer to
> as
> >>journal,
> >> >audit, and log data. Much of our efforts are related to what CEE
> >>labels the CLT.
> >> >
> >> >However, we've also done some work (i.e. the schemas I just sent
> out)
> >> >related to describing log data in a strict format. Because our
> schemas
> >>are
> >> >similar (and have similar goals) we were hoping that it would be
> >>possible to
> >> >combine our efforts.
> >> >
> >> >We feel that our schemas are a lot simpler and easier to understand
> >>than the
> >> >current CEE approach, and would like to see the CEE format
> transformed
> >>into
> >> >a something a little simpler and easier to follow.
> >> >
> >> >-Phil
> >> >
> >> >-----Original Message-----
> >> >From: Raffael Marty [mailto:[hidden email]]
> >> >Sent: Thursday, August 30, 2012 12:12 PM
> >> >To: Philip Black-Knight
> >> >Cc: [hidden email]
> >> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
> discribe
> >>event
> >> >data
> >> >
> >> >Hi Phil,
> >> >
> >> >When you say "we have been ...", who is we?
> >> >
> >> >The goals you list are pretty much the ones we are addressing with
> >>CEE. Glad
> >> >we are all on the same page. Would love to hear your specific input
> on
> >>CEE.
> >> >
> >> >Thanks
> >> >
> >> >  -raffy
> >> >
> >> >--
> >> >  Raffael Marty
> >> >  @raffaelmarty
> >>http://raffy.ch
> >> >
> >> >On Aug 30, 2012, at 5:15 AM, Philip Black-Knight
> <[hidden email]>
> >> >wrote:
> >> >
> >> >> We've been working on a similar effort for describing and
> enumerating
> >> >events. Our architecture is comprised of 3 distinct documents:
> >> >>
> >> >> -       EventProfileList (profiles.xsd) - An EventProfileList is
> an
> >>enumeration of
> >> >events for a class of device (e.g. router, Cisco router, UNIX
> system,
> >>Solaris).
> >> >The EventProfileList can be for a generic device (e.g. router,
> switch,
> >>UNIX
> >> >system) or specific to a vendor/product (e.g. Cisco Router, Juniper
> >>switch,
> >> >Oracle Solaris). This is similar to how SNMP works. The type for
> each
> >>field is
> >> >recorded in the EventProfile, and would be one of the existing CEE
> >>dictionary
> >> >types.
> >> >>
> >> >> -       Events (events.xsd) - An Events document is a listing of
> one
> >>or more
> >> >event instances (called records). Each record references an Event
> from
> >>a
> >> >specific EventProfile.
> >> >>
> >> >> -       Translations (translations.xsd) - A Translation document
> >>provides a
> >> >human readable, language specific, version of the events enumerated
> in
> >>an
> >> >EventProfileList. To display the details of an Event, you would use
> a
> >>record
> >> >from an Events document, combined with the translation from a
> >>Translations
> >> >document, to render a human readable string in the user's language
> of
> >>choice.
> >> >>
> >> >> For example, an EventProfileList could include an event that
> >>indicates a user
> >> >logged on to a particular system. In the EventProfileList, the
> >>descriptor for this
> >> >login event would indicate the required fields: user_name,
> hostname,
> >>and tty.
> >> >When a login event occurs, the system would record the event in an
> >>Events
> >> >document, including the values for these fields. A Translations
> >>document (for
> >> >US English) might include the format string: "%user_name% logged on
> to
> >> >%hostname% on %tty% at %timestamp%". Using this format string, and
> the
> >> >event record from the Events document, a human readable string can
> be
> >> >rendered for the user.
> >> >>
> >> >> One of the primary goals is to move away from loosely formatted
> event
> >> >messages which are hard to parse, and can change from one version
> of
> >> >software package to the next. Using a common format to enumerate
> event
> >> >records makes processing and correlating data simpler. This is all
> >>part of a
> >> >larger effort to define data formats and protocols for reliably
> >>transferring
> >> >Journal/Audit/Logging events. Additionally, we'd like to be able to
> >>align
> >> >various software products that perform similar tasks (i.e. routers,
> or
> >>network
> >> >switches, etc) to all use a common set of events so that a router
> >>built by Cisco
> >> >& a router built by Juniper Networks would both generate the same
> >>events.
> >> >Each vendor would be free to augment their list of events (by
> >>providing their
> >> >own EventProfileLists) and would be able to define new events,
> and/or
> >> >extend existing events.
> >> >>
> >> >> Existing syslog data is easily recorded using this approach. The
> >>EventProfile
> >> >for syslog data indicates the required fields for msg (the actual
> >>syslog
> >> >message), facility, priority, hostname, applicationName, etc that
> are
> >>part of
> >> >standard syslog messages. The attached zip includes an
> >>EventProfileList &
> >> >sample Events document for syslog.
> >> >>
> >> >>
> >> >> Another important goal was to support existing logging systems.
> While
> >> >creating these schemas, we studied Syslog, Windows Event Structure,
> >>Solaris
> >> >Audit, and Linux Audit and believe that all can be represented in
> our
> >>Event
> >> >structures with no loss of information or context.
> >> >>
> >> >> Other features:
> >> >>
> >> >> -       Support optional digital signatures of all XML files to
> meet
> >>government
> >> >requirements (Sarbanes-Oxley, Dodd-Frank, HIPAA, etc...) for audit
> log
> >> >integrity and non-repudiation
> >> >>
> >> >> -       Updates to language translations (both new languages and
> to
> >>fix errors
> >> >in translations) can be added without the need to update a single
> line
> >>of code.
> >> >>
> >> >> -       Both the EventProfileLists and the Event descriptors
> within
> >>the
> >> >EventProfile list are versioned, allowing event records &
> translations
> >>to target
> >> >a specific version of an event.
> >> >>
> >> >> -       Documents containing Event records may be in XML or JSON
> >>format
> >> >>
> >> >> -       Limited support for Event inheritance. Each Event
> descriptor
> >>can inherit
> >> >from some other Event descriptor (multiple inheritance is not
> >>supported). For
> >> >example, a generic profile could exist that describes a "logon"
> event,
> >>and
> >> >specifies the required fields username, hostname, and time. The
> >> >EventProfileList for Windows would identify a login event that
> >>inherits from
> >> >this base, and adds the fields first_name, last_name, guid, etc.
> The
> >> >EventProfileList for Linux would add fields for uid, tty, etc. When
> a
> >>Linux Logon
> >> >event is recorded, in addition to the uid and tty, it must also
> >>specify the fields
> >> >from the inherited class (i.e. username, hostname, and time). This
> is
> >>similar to
> >> >current CEE approach to taxonomies.
> >> >>
> >> >>
> >> >> To further support this effort, we've been working on a network
> >> >> protocol to reliably and securely transfer Event records off box.
> >>These
> >> >protocols are designed to support raw XML as well as compressed
> formats
> >> >such as Efficient XML Interchange. Attached are a draft set of
> schemas
> >>and
> >> >sample documents (it is just a renamed zip file).
> >> >>
> >> >> -Phil
> >> >>
> >> >> <eventSchemas.1814d42.piz>
> >

Re: Similar effort to formally describe event data

Fletcher, Boyd C IV Mr CIV OSD
Have you actually reviewed the XML that Phil sent? The changes we are
advocating, while significant to the XML portion of the spec, are not major
changes to the JSON portion. It certainly would not take 5 years to adopt
the changes; more likely a couple of months.

And no, I think it is better to spend the time to get it right the first
time than to push out a spec that does not meet the needs of the systems
that are actually generating the data. I don't think we want to repeat the
mistakes of many other standards that were either rushed through the
process (e.g. OOXML) or retroactively made standards because of common
usage rather than good design (Syslog being a good example in this
case).

Any specification for eventing should have, at minimum, the consensus
endorsement of the major O/S, application, and network systems vendors
(including but not limited to): Microsoft, Oracle/Sun, IBM, Red Hat,
Juniper, Cisco, Symantec, McAfee, etc...

Currently most of the "board" consists of audit reduction and analysis
folks, some security companies, and a few minor players in the market
space.

In order for CEE to be adopted, the major industry players need to be
participating in and agreeing to the specification. So far that really
hasn't happened.


We also do not believe RFC5424 should be an endorsed CEE transport.
SYSLOG lacks guaranteed reliable delivery (TCP/IP is not sufficient), since
TCP does not guarantee that the audit repository understood and processed
the message. Also, in section 6.1, syslog allows the receiver to arbitrarily
truncate the data without any notification to the sender; that type of
behavior in an audit protocol should not be encouraged.



Other comments inline below.






On 9/25/12 9:29 AM, "Rainer Gerhards" <[hidden email]> wrote:

>General comment: wouldn't it make more sense to finish a simple approach
>before sidetracking it with a more complex one? CEE has already taken a
>lot of time, and now that we see some interest in the market, we start all
>over with a new, much more complex thing? I see that CEE will not solve
>all use cases, but IMHO it is far better to solve *a lot* (no percentage
>given, intentionally) than none (because we wait another 5 years). And if
>we start over, by the time we have almost finished, someone else will
>show up. This very same story has been happening for over 15 years...
>
>Detail comments below.
>
>> -----Original Message-----
>> From: Fletcher, Boyd C IV Mr CIV OSD [mailto:[hidden email]]
>> Sent: Tuesday, September 25, 2012 3:45 AM
>> To: [hidden email]
>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe
>> event data
>>
>> Comments below.
>>
>>
>>
>> On 9/24/12 6:05 PM, "Raffael Marty" <[hidden email]> wrote:
>>
>> >> our concerns about 1.0 are:
>> >>
>> >> 1) lack of support for language translations for human readable
>> >>representations
>> >
>> >The goal of CEE from the beginning was to have a machine-processable
>> >standard. May I ask, how would you use this? What does human readable
>> >mean? Like actual English sentences?
>>
>> Yes, english, german, japanese, etc... sentences that are mapped via an
>> event ID and profile ID to the raw data being stored and/or sent over
>> the
>> wire.
>>
>> Machine processable must be tied to human readable version otherwise
>> you
>> lose meaning during analysis.
>>
>> We do a huge amount of automated and semi-automated log analysis.
>> Having
>> both the raw data and the human readable version is critical to
>> effective
>> understanding of the data.
>
>If the CEE record is based on a profile, you can always assign
>human-readable text, as the semantics are known. For non-profile records
>or extensions, this is currently not possible. Yes, that's right.


Inherited Event classes (or profiles) make CEE really powerful. Not having
them severely limits its flexibility and long-term growth.
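
A rough sketch of what single inheritance between profiles could look like, using the base O/S, Unix, and Linux login example from below. The `PROFILES` table and both helper functions are hypothetical; neither CEE nor the proposed schemas define them in this form.

```python
# Hypothetical profile table: each profile names at most one parent
# (single inheritance) and lists only the fields it adds.
PROFILES = {
    "base-os": {"parent": None, "fields": ["USER", "TIME"]},
    "unix": {"parent": "base-os", "fields": ["HOSTNAME", "TTY"]},
    "linux": {"parent": "unix", "fields": ["SELINUX_CONTEXT"]},
}


def required_fields(profile_id):
    """Collect required fields from the root profile down to profile_id."""
    chain = []
    while profile_id is not None:
        chain.append(profile_id)
        profile_id = PROFILES[profile_id]["parent"]
    fields = []
    for pid in reversed(chain):  # root first, most specific last
        fields.extend(PROFILES[pid]["fields"])
    return fields


def missing_fields(profile_id, record):
    """Return the required fields that the record does not supply."""
    return [f for f in required_fields(profile_id) if f not in record]


print(required_fields("linux"))
# ['USER', 'TIME', 'HOSTNAME', 'TTY', 'SELINUX_CONTEXT']
```

A Linux login record validated this way must carry the inherited fields as well as its own, which is the behavior we are asking for.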

>>
>>
>> >
>> >> 2) Microsoft Event data does not map cleanly into v1.0
>> >
>> >That's a big concern. Could you elaborate on that?
>>
>> Its dynamically extensible and it supports binary data. It also has the
>> concept of profiles.
>
>So what?

So in order to support Microsoft Events, you need those capabilities along
with the ability to tie the eventID to a profile and then to a translation
for rendering (and understanding).


>
>>
>> >
>> >> 3) support for device class profiles
>> >
>> >Not sure I understand what you mean by this? device classes?
>>
>>
>> Like SNMP profiles for routers, switches, operating systems,
>> applications,
>> etc...
>>
>> For example, CISCO routers support both the basic IETF Router MIB and
>> the
>> CISCO specific MIB.
>>
>> We think audit events should work the same way.
>>
>> For example for RHEL,
>>
>> There would be a base O/S profile, that would be extended by  a POSIX
>> Unix
>> profile, then the Linux profile
>>
>> Base O/S Profile: USER logged in at TIME
>> Unix Profile: USER logged in at TIME on HOSTNAME via TTY
>> Linux Profile: USER logged in at TIME on HOSTNAME via TTY with
>> SELINUX_CONTEXT
>>
>> USER, TIME, HOSTNAME, TTY, and SELINUX_CONTEXT would be the data sent over
>> the wire, and it would be tied to an event ID which is located in the
>> Profile.
>>
>> So in the Linux case, you would have an eventRecord list:
>>
>> RecordID=102848     // some machine generated unique ID for that
>> instantiation of the event
>> EventID=2284  // the Linux Login event ID
>> ProfileID=CEE-RHEL6   // some type of profile ID (UUID, URI, etc...) that
>> references a profile (event class) that  contains the events for that
>> system/application
>>
>> USER=boyd
>> TIME=23:29:00 24SEP2012
>> TTY=ttya
>> SELINUX_CONTEXT=admin_user_t
>>
>>
>> While the translations file would contain something like for eventID
>> 2284
>> in the english language:
>>
>> <eventTranslation eventID="2284" lang="en">
>> %USER% logged in at %TIME% on %HOSTNAME% via %TTY% with
>> %SELINUX_CONTEXT%
>> </eventTranslation>
>>
>> A renderer would use variable substitution to  generate output like:
>>
>> "boyd logged in at 23:29:00 24SEP2012 via tty a with admin_user_t"
>>
>>
>>
>> Analytics Software would use the Profile, eventIDs, Dictionary, while
>> an
>> audit viewer application would map the values to the human readable
>> translation.
>>
>Assuming that you write the necessary software, I do not see why you
>could not do this with CEE.


CEE does not support these constructs directly. This should be explicit in
the specification.


>
>> The XML files that Phil sent have some good examples of how this would
>> work.
>>
>>
>>
>> >
>> >> 4) lack of support for digitally signed records
>> >
>> >This is a property that would be part of the transport standard. The
>> nice
>> >thing about how CEE is structured is that everything is very modular.
>> It
>> >would be fairly simple to add signatures and hash chaining as modules
>> to
>> >the transport at some point. However, I don't see signatures on
>> records
>> >as a huge demand in the market.
>>
>> Transport only will not work for the legal requirements (HIPAA,
>> Sarbanes-Oxley, etc.) that the O/S and applications vendors are being
>> asked to support. Transport security (e.g. TLS) only guarantees that the
>> transmission wasn't corrupted, not when/where/by whom/how the audit
>> record was generated.
>>
>> The o/s and applications vendors are under a lot of legal (gov. regs)
>> and
>> financial community pressure to support signed audit records.
>
>Here I agree: transport signature is insufficient. However, signature can
>be added to the CEE container, and last year it was planned (at least
>discussed) to do that in a later revision of the spec.

We think it should be in version 1.0 for the XML portion. Unfortunately JSON
doesn't have a standardized schema specification, canonicalization, or
digital signing support.
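
As a small illustration of why canonicalization matters for signing, the sketch below hashes two JSON serializations of the same field. Everything in it is an assumption made for the example: ad_hoc_canonical() is a home-grown convention (sorted keys, no extra whitespace), not a standard, and the shared-key HMAC merely stands in for a real digital signature, which would be needed for non-repudiation.

```python
import hashlib
import hmac
import json


def digest(text):
    """SHA-256 over the exact bytes of a serialized record."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The same field serialized with two different key orders.
a = '{"name":"username","value":"jsmith"}'
b = '{"value":"jsmith","name":"username"}'

print(json.loads(a) == json.loads(b))  # True: identical data
print(digest(a) == digest(b))          # False: different bytes


def ad_hoc_canonical(obj):
    """Home-grown canonical form (NOT a standard): sorted keys, no padding."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)


def sign(obj, key):
    """HMAC-SHA256 over the ad hoc canonical form (stand-in for a signature)."""
    return hmac.new(key, ad_hoc_canonical(obj).encode("utf-8"),
                    hashlib.sha256).hexdigest()


KEY = b"demo-key-not-for-production"  # hypothetical key for the example
print(sign(json.loads(a), KEY) == sign(json.loads(b), KEY))  # True
```

The last line only holds because both sides agreed on the same private convention, which is exactly the kind of agreement a specification would have to define.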


We also think there should be explicit support for EXI. EXI is far more
space and processing efficient than JSON without all of its security
problems, including a complete lack of validation capability.


>
>> >
>> >Right now, with CEE, we are trying to address the biggest problems:
>> >syntax, dictionary, and a taxonomy. Those three things will get us 80%
>> >the way to machine interoperability...
>>
>>
>> We don't believe the current Syntax (especially for XML) is sufficient
>> to meet the demands of modern operating systems and applications. Most
>> of the CEE structure that exists today appears to focus on Unix/Linux
>> Audit, Routers, and Switches. After discussing this with John Wunder, he
>> agreed the current approach was focused on those use cases.
>
>This, and Windows events. Eric Fitzgerald was a long-term member of a
>board before he departed Microsoft. Until last year, he made sure CEE
>contains everything Microsoft needed. So this point sounds a bit moot to
>me.

I disagree. We tried mapping Windows Events from the O/S and Applications
into CEE and could not do it cleanly.


>
>Rainer
>>
>> Did you look at the XML files that Phil sent?
>>
>>
>>
>> >
>> >Thanks
>> >
>> >  -raffy
>> >
>> >>
>> >>
>> >> I think it would provide more value to the community to release a
>> >>robust specification instead of an incomplete one.
>> >>
>> >> boyd
>> >>
>> >>
>> >>
>> >>
>> >> From: Wunder, John A. [[hidden email]]
>> >> Sent: Sunday, September 23, 2012 3:53 PM
>> >> To: [hidden email]
>> >> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
>> describe
>> >>event data
>> >>
>> >> So I never really got a lot of response to this.
>> >>
>> >> After our last CEE Board call (almost two weeks ago, I've been late
>> >>sending this) there was general agreement that considering such a
>> big
>> >>list of changes so close to 1.0 release is probably a bad idea. So
>> the
>> >>board's recommendation was to hold off on these changes until a post-
>> 1.0
>> >>release and then re-evaluate them one at a time as we find the time
>> and
>> >>the need.
>> >>
>> >> Do people here agree with that? Does anyone here want to argue for
>> >>considering one of these changes for the current version?
>> >>
>> >> A related question is why doesn't CEE have a unique event identifier
>> >>(#3 in the list of changes below) as a required field? I remember
>> seeing
>> >>it in an older version but it seems to have been removed. Does
>> anyone
>> >>know the reason for removing it? Should we consider adding it back?
>> Or
>> >>if I'm mistaken and CEE has never included it, should we add it?
>> >>
>> >> John
>> >>
>> >> >-----Original Message-----
>> >> >From: Wunder, John A. [mailto:[hidden email]]
>> >> >Sent: Wednesday, September 05, 2012 9:14 AM
>> >> >To: cee-discussion-list CEE-Related Discussion
>> >> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
>> describe
>> >>event
>> >> >data
>> >> >
>> >> >Has anyone had a chance to look at the files that Phil sent out? If
>> >>you don't
>> >> >want to go digging through them, here's an example of an event from
>> >>their
>> >> >proposal:
>> >> >
>> >> >{
>> >> >      "eventReference":{
>> >> >         "profileReference":{
>> >> >            "uuid":"00112233-1234-5678-9abc-ed0123456789",
>> >> >            "version":1
>> >> >         },
>> >> >         "eventId":1,
>> >> >         "version":1
>> >> >      },
>> >> >      "criticality":"DEBUG",
>> >> >      "dateTime":"2001-12-31T12:00:00",
>> >> >      "recordId":"1234",
>> >> >      "fields":[
>> >> >         {
>> >> >            "name":"username",
>> >> >            "value":"jsmith"
>> >> >         },
>> >> >         {
>> >> >            "name":"hostname",
>> >> >            "value":"foo.dod.mil"
>> >> >         }
>> >> >      ]
>> >> > }
>> >> >
>> >> >As you can see, there are a few differences from CEE JSON:
>> >> >
>> >> >1. Profiles are referenced using a UUID and version rather than a
>> URI
>> >> >2. Profiles include a list of "event classes", which define
>> specific
>> >>fields that are
>> >> >required for specific event types. In contrast, CEE profiles (as
>> >>currently
>> >> >implemented) tend to just provide a list of fields you CAN or MUST
>> use
>> >>for
>> >> >that profile without providing a list of event classes.
>> >> >3. Events have a record ID (we've talked about this before but as
>> >>currently
>> >> >implemented we don't require that)
>> >> >4. Event fields are contained in a "fields" array with
>> "name"/"value"
>> >>key/value
>> >> >pairs rather than standard JSON key/value pairs.
>> >> >
>> >> >Thoughts?
>> >> >
>> >> >Personally I'm undecided on 1, prefer 2 to CEE, prefer 3 to CEE,
>> and
>> >>prefer CEE
>> >> >to 4. Phil, can you give some explanation behind 1 and 4?
>> Specifically:
>> >> >1. Why use a UUID rather than a URI? The advantages of a URI are
>> that
>> >>you
>> >> >can make it a resolveable URL for ease of use and it provides more
>> >>human-
>> >> >readable profile references (i.e. in CEE I can tell what a profile
>> is
>> >>by looking at
>> >> >the URI, a UUID is transparent unless you know where to go to find
>> it).
>> >> >4. Why use the "fields" structure? I realize we've talked about
>> this
>> >>before, I
>> >> >just want to get the explanation on the list for everyone else to
>> see.
>> >> >
>> >> >John
>> >> >
>> >> >-----Original Message-----
>> >> >From: Wunder, John A. [mailto:[hidden email]]
>> >> >Sent: Thursday, August 30, 2012 3:36 PM
>> >> >To: cee-discussion-list CEE-Related Discussion
>> >> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
>> describe
>> >>event
>> >> >data
>> >> >
>> >> >I've been working with Phil and the JALoP team to help them
>> understand
>> >>CEE
>> >> >and where we're coming from as well as to understand more about
>> where
>> >> >they're coming from. So at this point I'm thinking I have a pretty
>> good
>> >> >understanding of the differences.
>> >> >
>> >> >In my opinion the most important distinction is that we have
>> different
>> >>focus
>> >> >areas. JALoP is focusing on syntax and (outside of the audit
>> schema)
>> >>the
>> >> >transport. CEE has the syntax component but also has more of a
>> focus on
>> >> >defining the event taxonomy and core field dictionary. So to that
>> >>extent if we
>> >> >move towards the JALoP schemas our existing work on the taxonomy
>> and
>> >> >field dictionary wouldn't be significantly affected. The caveat is
>> >>that we'll have
>> >> >to go back and forth a little bit on how the taxonomy ends up being
>> >> >represented, since although their concept is kind of a taxonomy
>> it's
>> >> >implemented very differently from ours.
>> >> >
>> >> >In the syntax specifically, the main philosophical difference is
>> that
>> >>CEE's
>> >> >primary focus has been JSON and lightweight uses while JALoP's has
>> been
>> >>XML
>> >> >and more "heavy" uses (that require validation, integrity, etc).
>> That
>> >>ends up
>> >> >being reflected in some of the structures that we use. The proposal
>> >>Phil has
>> >> >here includes both JSON and XML, just like CEE does, but because it
>> >>started as
>> >> >XML the JSON is influenced a little by that. Take a look at what he
>> >>sent (the
>> >> >.piz file is a .zip file w/ the schemas) and compare it to what we
>> >>have now.
>> >> >
>> >> >I think it's unlikely that we'd end up doing a full replacement of
>> the
>> >>CEE syntax
>> >> >with the JALoP audit syntax, although of course that's definitely
>> an
>> >>option. But
>> >> >in any case we should look at what's out there and take the best
>> >>pieces of
>> >> >each. If it would help I could set up a telecon where Phil and the
>> >>team can
>> >> >explain what they've done and you all can ask questions.
>> >> >
>> >> >I'd especially want to get the opinion of event record PRODUCERS
>> here,
>> >>since
>> >> >they're the ones who will be using the syntax.
>> >> >
>> >> >John
>> >> >
>> >> >-----Original Message-----
>> >> >From: Philip Black-Knight [mailto:[hidden email]]
>> >> >Sent: Thursday, August 30, 2012 1:17 PM
>> >> >To: cee-discussion-list CEE-Related Discussion
>> >> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
>> describe
>> >>event
>> >> >data
>> >> >
>> >> >I'm working on a government funded project (JALoP). The primary
>> goals
>> >>are to
>> >> >author & implement a protocol to reliably transfer what we refer to
>> as
>> >>journal,
>> >> >audit, and log data. Much of our efforts are related to what CEE
>> >>labels the CLT.
>> >> >
>> >> >However, we've also done some work (i.e. the schemas I just sent
>> out)
>> >> >related to describing log data in a strict format. Because our
>> schemas
>> >>are
>> >> >similar (and have similar goals) we were hoping that it would be
>> >>possible to
>> >> >combine our efforts.
>> >> >
>> >> >We feel that our schemas are a lot simpler and easier to understand
>> >>than the
>> >> >current CEE approach, and would like to see the CEE format
>> transformed
>> >>into
>> >> >something a little simpler and easier to follow.
>> >> >
>> >> >-Phil
>> >> >
>> >> >-----Original Message-----
>> >> >From: Raffael Marty [mailto:[hidden email]]
>> >> >Sent: Thursday, August 30, 2012 12:12 PM
>> >> >To: Philip Black-Knight
>> >> >Cc: [hidden email]
>> >> >Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
>> describe
>> >>event
>> >> >data
>> >> >
>> >> >Hi Phil,
>> >> >
>> >> >When you say "we have been ...", who is we?
>> >> >
>> >> >The goals you list are pretty much the ones we are addressing with
>> >>CEE. Glad
>> >> >we are all on the same page. Would love to hear your specific input
>> on
>> >>CEE.
>> >> >
>> >> >Thanks
>> >> >
>> >> >  -raffy
>> >> >
>> >> >--
>> >> >  Raffael Marty
>> >> >  @raffaelmarty
>> >>http://raffy.ch
>> >> >
>> >> >On Aug 30, 2012, at 5:15 AM, Philip Black-Knight
>> <[hidden email]>
>> >> >wrote:
>> >> >
>> >> >> We've been working on a similar effort for describing and
>> enumerating
>> >> >events. Our architecture is comprised of 3 distinct documents:
>> >> >>
>> >> >> -       EventProfileList (profiles.xsd) - An EventProfileList is
>> an
>> >>enumeration of
>> >> >events for a class of device (e.g. router, Cisco router, UNIX
>> system,
>> >>Solaris).
>> >> >The EventProfileList can be for a generic device (e.g. router,
>> switch,
>> >>UNIX
>> >> >system) or specific to a vendor/product (e.g. Cisco Router, Juniper
>> >>switch,
>> >> >Oracle Solaris). This is similar to how SNMP works. The type for
>> each
>> >>field is
>> >> >recorded in the EventProfile, and would be one of the existing CEE
>> >>dictionary
>> >> >types.
>> >> >>
>> >> >> -       Events (events.xsd) - An Events document is a listing of
>> one
>> >>or more
>> >> >event instances (called records). Each record references an Event
>> from
>> >>a
>> >> >specific EventProfile.
>> >> >>
>> >> >> -       Translations (translations.xsd) - A Translation document
>> >>provides a
>> >> >human readable, language specific, version of the events enumerated
>> in
>> >>an
>> >> >EventProfileList. To display the details of an Event, you would use
>> a
>> >>record
>> >> >from an Events document, combined with the translation from a
>> >>Translations
>> >> >document, to render a human readable string in the user's language
>> of
>> >>choice.
>> >> >>
>> >> >> For example, an EventProfileList could include an event that
>> >>indicates a user
>> >> >logged on to a particular system. In the EventProfileList, the
>> >>descriptor for this
>> >> >login event would indicate the required fields: user_name,
>> hostname,
>> >>and tty.
>> >> >When a login event occurs, the system would record the event in an
>> >>Events
>> >> >document, including the values for these fields. A Translations
>> >>document (for
>> >> >US English) might include the format string: "%user_name% logged on
>> to
>> >> >%hostname% on %tty% at %timestamp%". Using this format string, and
>> the
>> >> >event record from the Events document, a human readable string can
>> be
>> >> >rendered for the user.
>> >> >>
>> >> >> One of the primary goals is to move away from loosely formatted
>> event
>> >> >messages which are hard to parse, and can change from one version
>> of
>> >> >software package to the next. Using a common format to enumerate
>> event
>> >> >records makes processing and correlating data simpler. This is all
>> >>part of a
>> >> >larger effort to define data formats and protocols for reliably
>> >>transferring
>> >> >Journal/Audit/Logging events. Additionally, we'd like to be able to
>> >>align
>> >> >various software products that perform similar tasks (i.e. routers,
>> or
>> >>network
>> >> >switches, etc) to all use a common set of events so that a router
>> >>built by Cisco
>> >> >& a router built by Juniper Networks would both generate the same
>> >>events.
>> >> >Each vendor would be free to augment their list of events (by
>> >>providing their
>> >> >own EventProfileLists) and would be able to define new events,
>> and/or
>> >> >extend existing events.
>> >> >>
>> >> >> Existing syslog data is easily recorded using this approach. The
>> >>EventProfile
>> >> >for syslog data indicates the required fields for msg (the actual
>> >>syslog
>> >> >message), facility, priority, hostname, applicationName, etc that
>> are
>> >>part of
>> >> >standard syslog messages. The attached zip includes an
>> >>EventProfileList &
>> >> >sample Events document for syslog.
>> >> >>
>> >> >>
>> >> >> Another important goal was to support existing logging systems.
>> While
>> >> >creating these schemas, we studied Syslog, Windows Event Structure,
>> >>Solaris
>> >> >Audit, and Linux Audit and believe that all can be represented in
>> our
>> >>Event
>> >> >structures with no loss of information or context.
>> >> >>
>> >> >> Other features:
>> >> >>
>> >> >> -       Support optional digital signatures of all XML files to
>> meet
>> >>government
>> >> >requirements (Sarbanes-Oxley, Dodd-Frank, HIPAA, etc.) for audit
>> log
>> >> >integrity and non-repudiation
>> >> >>
>> >> >> -       Updates to language translations (both new languages and
>> to
>> >>fix errors
>> >> >in translations) can be added without the need to update a single
>> line
>> >>of code.
>> >> >>
>> >> >> -       Both the EventProfileLists and the Event descriptors
>> within
>> >>the
>> >> >EventProfile list are versioned, allowing event records &
>> translations
>> >>to target
>> >> >a specific version of an event.
>> >> >>
>> >> >> -       Documents containing Event records may be in XML or JSON
>> >>format
>> >> >>
>> >> >> -       Limited support for Event inheritance. Each Event
>> descriptor
>> >>can inherit
>> >> >from some other Event descriptor (multiple inheritance is not
>> >>supported). For
>> >> >example, a generic profile could exist that describes a "logon"
>> event,
>> >>and
>> >> >specifies the required fields username, hostname, and time. The
>> >> >EventProfileList for Windows would identify a login event that
>> >>inherits from
>> >> >this base, and adds the fields first_name, last_name, guid, etc.
>> The
>> >> >EventProfileList for Linux would add fields for uid, tty, etc. When
>> a
>> >>Linux Logon
>> >> >event is recorded, in addition to the uid and tty, it must also
>> >>specify the fields
>> >> >from the inherited class (i.e. username, hostname, and time). This
>> is
>> >>similar to
>> >> >current CEE approach to taxonomies.
>> >> >>
>> >> >>
>> >> >> To further support this effort, we've been working on a network
>> >> >> protocol to reliably and securely transfer Event records off box.
>> >>These
>> >> >protocols are designed to support raw XML as well as compressed
>> formats
>> >> >such as Efficient XML Interchange (EXI). Attached are a draft set of
>> schemas
>> >>and
>> >> >sample documents (it is just a renamed zip file).
>> >> >>
>> >> >> -Phil
>> >> >>
>> >> >> <eventSchemas.1814d42.piz>
>> >

Re: Similar effort to formally describe event data

Keith Robertson
Boyd,

This is a fairly long thread, but I'll try to address a few points in
the thread.

EventID: There isn't an exact mapping to this variable; however, there
is 'msgid' in the core-profile.xsd.  Since this field is a string type
you could easily put a UUID, IBM-style message ID, or something else in
there.

Profiles: I've looked at the latest version of cee.xsd and
cee-profile.xsd and this is how *I* would create a 'profile' or a vendor
extension. AFAIK it is compliant with the spec.

Step 1: Write an XML schema for your profile/extension (example in [1]).  
Notice that types extend from cee*.xsd
Step 2: Use the <native/> element and the "profile" attribute within the
<cee:event/> element to add in your vendor specific profile.  Look at
[2] for a valid XML example and [3] for the same example in JSON format.

Cheers,
Keith

[1]
<?xml version="1.0" encoding="UTF-8"?>
<!-- Profile/extension schema; field types are taken from cee.xsd. -->
<schema xmlns="http://www.w3.org/2001/XMLSchema"
     targetNamespace="http://www.netfilter.org/projects/iptables/cee"
     xmlns:tns="http://www.netfilter.org/projects/iptables/cee"
     elementFormDefault="qualified" xmlns:Q1="http://lumberjack.org"
     xmlns:cee="http://cee.mitre.org/1.0/">
     <!-- Unprefixed: XML Schema is the default namespace in this file. -->
     <import namespace="http://cee.mitre.org/1.0/"
          schemaLocation="cee.xsd" />
     <!-- Required fields of an iptables deny event, in order. -->
     <complexType name="IPTablesTypeDenyType">
         <sequence>
             <element name="rule" minOccurs="1" maxOccurs="1"
                  type="cee:StringField" />
             <element name="ifName" minOccurs="1" maxOccurs="1"
                  type="cee:StringField" />
             <element name="srcIP" minOccurs="1" maxOccurs="1"
                  type="cee:IPv4Field" />
             <element name="srcPort" minOccurs="1" maxOccurs="1"
                  type="positiveInteger" />
             <element name="destIP" minOccurs="1" maxOccurs="1"
                  type="cee:IPv4Field" />
             <element name="destPort" minOccurs="1" maxOccurs="1"
                  type="positiveInteger" />
             <element name="HWAddr" minOccurs="1" maxOccurs="1"
                  type="cee:StringField" />
         </sequence>
     </complexType>
     <element name="IPTablesDeny" type="tns:IPTablesTypeDenyType"></element>
</schema>

[2]  Note that this XML document is both valid *and* well-formed.
<?xml version="1.0" encoding="UTF-8"?>
<p:cee xmlns:p="http://cee.mitre.org/1.0/"
xmlns:p1="http://cee.mitre.org/1.0/profile/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://cee.mitre.org/1.0/ cee.xsd ">
     <p:event profile="http://www.netfilter.org/projects/iptables/cee">
         <p1:host type="string">example.com</p1:host>
         <p1:pname type="string">apache httpd</p1:pname>
         <p1:time type="string">2012-09-26T10:37:31</p1:time>
         <p:native>
             <tns:IPTablesDeny
                 xmlns:tns="http://www.netfilter.org/projects/iptables/cee"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://www.netfilter.org/projects/iptables/cee iptables.xsd">
                 <tns:rule>DMZ blacklist</tns:rule>
                 <tns:ifName>eth0</tns:ifName>
                 <tns:srcIP>192.168.1.10</tns:srcIP>
                 <tns:srcPort>80</tns:srcPort>
                 <tns:destIP>192.168.1.15</tns:destIP>
                 <tns:destPort>8080</tns:destPort>
                 <tns:HWAddr>52:54:00:70:82:5E</tns:HWAddr>
             </tns:IPTablesDeny>
         </p:native>
     </p:event>
</p:cee>

[3]
{
   "@xmlns:p": "http://cee.mitre.org/1.0/",
   "@xmlns:p1": "http://cee.mitre.org/1.0/profile/",
   "@xmlns:xsi": "http://www.w3.org/2001/XMLSchema-instance",
   "@xsi:schemaLocation": "http://cee.mitre.org/1.0/ cee.xsd ",
   "p:event":   {
     "@profile": "http://www.netfilter.org/projects/iptables/cee",
     "p1:host":     {
       "@xmlns:p1": "http://cee.mitre.org/1.0/profile/",
       "#text": "example.com"
     },
     "p1:pname":     {
       "@xmlns:p1": "http://cee.mitre.org/1.0/profile/",
       "#text": "apache httpd"
     },
     "p1:time":     {
       "@xmlns:p1": "http://cee.mitre.org/1.0/profile/",
       "#text": "2012-09-26T10:37:31"
     },
     "p:native": {"tns:IPTablesDeny":     {
       "@xmlns:tns": "http://www.netfilter.org/projects/iptables/cee",
       "@xmlns:xsi": "http://www.w3.org/2001/XMLSchema-instance",
       "@xsi:schemaLocation":
"http://www.netfilter.org/projects/iptables/cee iptables.xsd ",
       "tns:rule": "DMZ blacklist",
       "tns:ifName": "eth0",
       "tns:srcIP": "192.168.1.10",
       "tns:srcPort": "80",
       "tns:destIP": "192.168.1.15",
       "tns:destPort": "8080",
       "tns:HWAddr": "52:54:00:70:82:5E"
     }}
   }
}
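
For a consumer of the JSON form in [3], a minimal sketch of pulling the vendor-specific fields back out of the "p:native" wrapper might look like the following. The native_fields() helper is hypothetical, the embedded document is a trimmed copy of [3], and the code assumes the "@" attribute and "prefix:name" conventions shown there.

```python
import json

# Trimmed copy of the JSON example in [3].
doc = json.loads("""
{
  "p:event": {
    "@profile": "http://www.netfilter.org/projects/iptables/cee",
    "p1:host": {"#text": "example.com"},
    "p:native": {"tns:IPTablesDeny": {
      "@xmlns:tns": "http://www.netfilter.org/projects/iptables/cee",
      "tns:rule": "DMZ blacklist",
      "tns:srcIP": "192.168.1.10",
      "tns:srcPort": "80"
    }}
  }
}
""")


def native_fields(document):
    """Return (profile URI, {field: value}) for the event's native payload."""
    event = document["p:event"]
    fields = {}
    for body in event["p:native"].values():
        for key, value in body.items():
            if key.startswith("@"):
                continue  # skip XML attributes such as namespace declarations
            # Drop the namespace prefix: "tns:rule" becomes "rule".
            fields[key.split(":", 1)[-1]] = value
    return event["@profile"], fields


profile, fields = native_fields(doc)
print(profile)
print(fields)
```

Note that every value comes back as a string (for example "80" for srcPort), so a consumer would still need the profile schema to recover the intended types.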

On 09/25/2012 09:46 PM, Fletcher, Boyd C IV Mr CIV OSD wrote:

> Have you actually reviewed the XML that Phil sent? The changes we are
> advocating, while significant to the XML portion of the spec, are not major
> changes to the JSON portion. It certainly would not take 5 years to adopt
> the changes - more likely a couple months.
>
> And no I think it is better to spend the time to get it right the first
> time than to push out a spec that does not meet the needs for the systems
> that are actually generating the data. I don't think we want to repeat the
> mistakes of many other standards that were either rushed through the
> process (e.g. like OOXML)  or retroactively made standards because of
> common usage not because of good design (e.g. Syslog being a good example
> in this case).
>
> Any specification for eventing should have, at minimum, the consensus
> endorsement of the major O/S, Application and network systems vendors
> (including but not limited to): Microsoft, Oracle/Sun, IBM, Red Hat,
> Juniper, Cisco, Symantec, McAfee, etc...
>
> Currently most of the "board" consists of audit reduction and analysis
> folks, some security companies, and a few minor players in the market
> space.
>
> In order for CEE to be adopted, the major industry players need to be
> participating and agreeing to the specification. So far that really hasn't
> happened.
>
>
> We also do not believe RFC 5424 should be an endorsed CEE transport.
> Syslog lacks guaranteed reliable delivery (TCP/IP is not sufficient), since
> that does not guarantee the audit repository understood and processed the
> message. Also, in section 6.1 syslog allows the receiver to arbitrarily
> truncate the data without any notification to the sender; that type of
> behavior in an audit protocol should not be encouraged.
>
>
>
> Other comments inline below.
>
>
>
>
>
>
> On 9/25/12 9:29 AM, "Rainer Gerhards" <[hidden email]> wrote:
>
>> General comment: wouldn't it make more sense to finish a simple approach
>> before sidetracking it with a more complex one? CEE already took a lot of
>> time, and now we see some interest in the market ... and now we start all
>> over with a new, much more complex thing? I see that CEE will not solve
>> all use-cases, but IMHO it is far better to solve *a lot* (no percentage
>> given by intention) than none (because we wait another 5 years). And if
>> we start over, by the time we have almost finished, someone else
>> shows up. This very same story has been happening for over 15 years...
>>
>> Detail comments below.
>>
>>> -----Original Message-----
>>> From: Fletcher, Boyd C IV Mr CIV OSD [mailto:[hidden email]]
>>> Sent: Tuesday, September 25, 2012 3:45 AM
>>> To: [hidden email]
>>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe
>>> event data
>>>
>>> Comments below.
>>>
>>>
>>>
>>> On 9/24/12 6:05 PM, "Raffael Marty" <[hidden email]> wrote:
>>>
>>>>> our concerns about 1.0 are:
>>>>>
>>>>> 1) lack of support for language translations for human readable
>>>>> representations
>>>> The goal of CEE from the beginning was to have a machine-processable
>>>> standard. May I ask, how would you use this? What does human readable
>>>> mean? Like actual English sentences?
>>> Yes, english, german, japanese, etc... sentences that are mapped via an
>>> event ID and profile ID to the raw data being stored and/or sent over
>>> the
>>> wire.
>>>
>>> Machine processable must be tied to human readable version otherwise
>>> you
>>> lose meaning during analysis.
>>>
>>> We do a huge amount of automated and semi-automated log analysis.
>>> Having
>>> both the raw data and the human readable version is critical to
>>> effective
>>> understanding of the data.
>> If the CEE record is based on a profile, you can always assign
>> human-readable text, as the semantic is known. For non-profile or
>> extensions, this is currently not possible. Yes, that's right.
>
> Inherited Event classes (or profiles) make CEE really powerful. Not having
> them severely limits its flexibility and long term growth
>
>>>
>>>>> 2) Microsoft Event data does not map cleanly into v1.0
>>>> That's a big concern. Could you elaborate on that?
>>> It's dynamically extensible and it supports binary data. It also has the
>>> concept of profiles.
>> So what?
> So in order to support Microsoft Events you need those capabilities along
> with the ability to tie the eventID to a profile and then to a translation
> for rendering (and understanding)
>
>
>>>>> 3) support for device class profiles
>>>> Not sure I understand what you mean by this? device classes?
>>>
>>> Like SNMP profiles for routers, switches, operating systems,
>>> applications,
>>> etc.
>>>
>>> For example, CISCO routers support both the basic IETF Router MIB and
>>> the
>>> CISCO specific MIB.
>>>
>>> We think audit events should work the same way.
>>>
>>> For example for RHEL,
>>>
>>> There would be a base O/S profile, that would be extended by  a POSIX
>>> Unix
>>> profile, then the Linux profile
>>>
>>> Base O/S Profile: USER logged in at TIME
>>> Unix Profile: USER logged in at TIME on HOSTNAME via TTY
>>> Linux Profile: USER logged in at TIME on HOSTNAME via TTY with
>>> SELINUX_CONTEXT
>>>
>>> USER, TIME, HOSTNAME, TTY, and SELINUX_CONTEXT would be the data sent
>>> over the wire, and it would be tied to an event ID which is located in
>>> the Profile.
>>>
>>> So in the Linux case, you would have an eventRecord list:
>>>
>>> RecordID=102848     // some machine-generated unique ID for that
>>> instantiation of the event
>>> EventID=2284  // the Linux Login event ID
>>> ProfileID=CEE-RHEL6   // some type of profile ID (UUID, URI, etc.) that
>>> references a profile (event class) that contains the events for that
>>> system/application
>>>
>>> USER=boyd
>>> TIME=23:29:00 24SEP2012
>>> TTY=ttya
>>> SELINUX_CONTEXT=admin_user_t
>>>
>>>
>>> The translations file would then contain something like this for
>>> eventID 2284 in English:
>>>
>>> <eventTranslation eventID="2284" lang="en">
>>> %USER% logged in at %TIME% on %HOSTNAME% via %TTY% with
>>> %SELINUX_CONTEXT%
>>> </eventTranslation>
>>>
>>> A renderer would use variable substitution to  generate output like:
>>>
>>> "boyd logged in at 23:29:00 24SEP2012 via ttya with admin_user_t"
>>>
>>>
>>>
>>> Analytics Software would use the Profile, eventIDs, Dictionary, while
>>> an
>>> audit viewer application would map the values to the human readable
>>> translation.
>>>
>> Assuming that you write the necessary software, I do not see why you
>> could not do this with CEE.
>
> CEE does not support these constructs directly. This should be explicit in
> the specification.
>
>
>>> The XML files that Phil sent have some good examples of how this would
>>> work.
>>>
>>>
>>>
>>>>> 4) lack of support for digitally signed records
>>>> This is a property that would be part of the transport standard. The
>>> nice
>>>> thing about how CEE is structured is that everything is very modular.
>>> It
>>>> would be fairly simple to add signatures and hash chaining as modules
>>> to
>>>> the transport at some point. However, I don't see signatures on
>>> records
>>>> as a huge demand in the market.
>>> Transport only will not work for the legal requirements (HIPAA,
>>> Sarbanes-Oxley, etc.) that the O/S and applications vendors are being
>>> asked to support. Transport security (e.g. TLS) only guarantees that the
>>> transmission wasn't corrupted, not when/where/by whom/how the audit
>>> record was generated.
>>>
>>> The o/s and applications vendors are under a lot of legal (gov. regs)
>>> and
>>> financial community pressure to support signed audit records.
>> Here I agree: transport signature is insufficient. However, signature can
>> be added to the CEE container, and last year it was planned (at least
>> discussed) to do that in a later revision of the spec.
> We think it should be in version 1.0 for the XML portion. Unfortunately JSON
> doesn't have a standardized schema specification, canonicalization, or
> digital signing support.
>
>
> We also think there should be explicit support for EXI. EXI is far more
> space and processing efficient than JSON without all of its security
> problems, including a complete lack of validation capability.
>
>
>>>> Right now, with CEE, we are trying to address the biggest problems:
>>>> syntax, dictionary, and a taxonomy. Those three things will get us 80%
>>>> the way to machine interoperability...
>>>
>>> We don't believe the current Syntax (especially for XML) is sufficient
>>> to meet the demands of modern operating systems and applications. Most
>>> of the CEE structure that exists today appears to focus on Unix/Linux
>>> Audit, Routers, and Switches. After discussing this with John Wunder, he
>>> agreed the current approach was focused on those use cases.
>> This, and Windows events. Eric Fitzgerald was a long-term member of a
>> board before he departed Microsoft. Until last year, he made sure CEE
>> contains everything Microsoft needed. So this point sounds a bit moot to
>> me.
> I disagree. We tried mapping Windows Events from the O/S and Applications
> into CEE and could not do it cleanly.
>
>
>> Rainer
>>> Did you look at the XML files that Phil sent?
>>>
>>>
>>>
>>>> Thanks
>>>>
>>>>   -raffy
>>>>
>>>>>
>>>>> I think it would provide more value to the community to release a
>>>>> robust specification instead of an incomplete one.
>>>>>
>>>>> boyd
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> From: Wunder, John A. [[hidden email]]
>>>>> Sent: Sunday, September 23, 2012 3:53 PM
>>>>> To: [hidden email]
>>>>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
>>> describe
>>>>> event data
>>>>>
>>>>> So I never really got a lot of response to this.
>>>>>
>>>>> After our last CEE Board call (almost two weeks ago, I've been late
>>>>> sending this) there was general agreement that considering such a
>>> big
>>>>> list of changes so close to 1.0 release is probably a bad idea. So
>>> the
>>>>> board's recommendation was to hold off on these changes until a post-
>>> 1.0
>>>>> release and then re-evaluate them one at a time as we find the time
>>> and
>>>>> the need.
>>>>>
>>>>> Do people here agree with that? Does anyone here want to argue for
>>>>> considering one of these changes for the current version?
>>>>>
>>>>> A related question is why doesn't CEE have a unique event identifier
>>>>> (#3 in the list of changes below) as a required field? I remember
>>> seeing
>>>>> it in an older version but it seems to have been removed. Does
>>> anyone
>>>>> know the reason for removing it? Should we consider adding it back?
>>> Or
>>>>> if I'm mistaken and CEE has never included it, should we add it?
>>>>>
>>>>> John
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Wunder, John A. [mailto:[hidden email]]
>>>>>> Sent: Wednesday, September 05, 2012 9:14 AM
>>>>>> To: cee-discussion-list CEE-Related Discussion
>>>>>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
>>> describe
>>>>> event
>>>>>> data
>>>>>>
>>>>>> Has anyone had a chance to look at the files that Phil sent out? If
>>>>> you don't
>>>>>> want to go digging through them, here's an example of an event from
>>>>> their
>>>>>> proposal:
>>>>>>
>>>>>> {
>>>>>>       "eventReference":{
>>>>>>          "profileReference":{
>>>>>>             "uuid":"00112233-1234-5678-9abc-ed0123456789",
>>>>>>             "version":1
>>>>>>          },
>>>>>>          "eventId":1,
>>>>>>          "version":1
>>>>>>       },
>>>>>>       "criticality":"DEBUG",
>>>>>>       "dateTime":"2001-12-31T12:00:00",
>>>>>>       "recordId":"1234",
>>>>>>       "fields":[
>>>>>>          {
>>>>>>             "name":"username",
>>>>>>             "value":"jsmith"
>>>>>>          },
>>>>>>          {
>>>>>>             "name":"hostname",
>>>>>>             "value":"foo.dod.mil"
>>>>>>          }
>>>>>>       ]
>>>>>> }
>>>>>>
>>>>>> As you can see, there are a few differences from CEE JSON:
>>>>>>
>>>>>> 1. Profiles are referenced using a UUID and version rather than a
>>> URI
>>>>>> 2. Profiles include a list of "event classes", which define
>>> specific
>>>>> fields that are
>>>>>> required for specific event types. In contrast, CEE profiles (as
>>>>> currently
>>>>>> implemented) tend to just provide a list of fields you CAN or MUST
>>> use
>>>>> for
>>>>>> that profile without providing a list of event classes.
>>>>>> 3. Events have a record ID (we've talked about this before but as
>>>>> currently
>>>>>> implemented we don't require that)
>>>>>> 4. Event fields are contained in a "fields" array with
>>> "name"/"value"
>>>>> key/value
>>>>>> pairs rather than standard JSON key/value pairs.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> Personally I'm undecided on 1, prefer 2 to CEE, prefer 3 to CEE,
>>> and
>>>>> prefer CEE
>>>>>> to 4. Phil, can you give some explanation behind 1 and 4?
>>> Specifically:
>>>>>> 1. Why use a UUID rather than a URI? The advantages of a URI are
>>> that
>>>>> you
>>>>>> can make it a resolvable URL for ease of use and it provides more
>>>>> human-
>>>>>> readable profile references (i.e. in CEE I can tell what a profile
>>> is
>>>>> by looking at
>>>>>> the URI, a UUID is opaque unless you know where to go to find
>>> it).
>>>>>> 4. Why use the "fields" structure? I realize we've talked about
>>> this
>>>>> before, I
>>>>>> just want to get the explanation on the list for everyone else to
>>> see.
>>>>>> John
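
To make difference 4 concrete, here is a minimal sketch that folds the proposal's "fields" array into a plain JSON object. The function name `flatten_fields` and the choice to keep the result under a "fields" key are assumptions made for illustration; this is not the actual CEE JSON layout, only a side-by-side of the two styles.

```python
import json


def flatten_fields(event):
    """Return a copy of the event with the "fields" array folded into a plain object."""
    flat = {key: value for key, value in event.items() if key != "fields"}
    folded = {}
    for field in event.get("fields", []):
        name = field["name"]
        if name in folded:
            # A name/value array can repeat a name; a JSON object cannot.
            raise ValueError(f"duplicate field name: {name}")
        folded[name] = field["value"]
    flat["fields"] = folded
    return flat


# A trimmed-down version of the event quoted above.
event = json.loads("""
{
  "recordId": "1234",
  "fields": [
    {"name": "username", "value": "jsmith"},
    {"name": "hostname", "value": "foo.dod.mil"}
  ]
}
""")

flat_event = flatten_fields(event)
print(json.dumps(flat_event, indent=2))
```

One practical difference the sketch surfaces: the array form can carry the same field name twice, while the object form has to reject or merge duplicates.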
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Wunder, John A. [mailto:[hidden email]]
>>>>>> Sent: Thursday, August 30, 2012 3:36 PM
>>>>>> To: cee-discussion-list CEE-Related Discussion
>>>>>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
>>> discribe
>>>>> event
>>>>>> data
>>>>>>
>>>>>> I've been working with Phil and the JALoP team to help them
>>> understand
>>>>> CEE
>>>>>> and where we're coming from as well as to understand more about
>>> where
>>>>>> they're coming from. So at this point I'm thinking I have a pretty
>>> good
>>>>>> understanding of the differences.
>>>>>>
>>>>>> In my opinion the most important distinction is that we have
>>> different
>>>>> focus
>>>>>> areas. JALoP is focusing on syntax and (outside of the audit
>>> schema)
>>>>> the
>>>>>> transport. CEE has the syntax component but also has more of a
>>> focus on
>>>>>> defining the event taxonomy and core field dictionary. So to that
>>>>> extent if we
>>>>>> move towards the JALoP schemas our existing work on the taxonomy
>>> and
>>>>>> field dictionary wouldn't be significantly affected. The caveat is
>>>>> that we'll have
>>>>>> to go back and forth a little bit on how the taxonomy ends up being
>>>>>> represented, since although their concept is kind of a taxonomy
>>> it's
>>>>>> implemented very differently from ours.
>>>>>>
>>>>>> In the syntax specifically, the main philosophical difference is
>>> that
>>>>> CEE's
>>>>>> primary focus has been JSON and lightweight uses while JALoP's has
>>> been
>>>>> XML
>>>>>> and more "heavy" uses (that require validation, integrity, etc).
>>> That
>>>>> ends up
>>>>>> being reflected in some of the structures that we use. The proposal
>>>>> Phil has
>>>>>> here includes both JSON and XML, just like CEE does, but because it
>>>>> started as
>>>>>> XML the JSON is influenced a little by that. Take a look at what he
>>>>> sent (the
>>>>>> .piz file is a .zip file w/ the schemas) and compare it to what we
>>>>> have now.
>>>>>> I think it's unlikely that we'd end up doing a full replacement of
>>> the
>>>>> CEE syntax
>>>>>> with the JALoP audit syntax, although of course that's definitely
>>> an
>>>>> option. But
>>>>>> in any case we should look at what's out there and take the best
>>>>> pieces of
>>>>>> each. If it would help I could set up a telecon where Phil and the
>>>>> team can
>>>>>> explain what they've done and you all can ask questions.
>>>>>>
>>>>>> I'd especially want to get the opinion of event record PRODUCERS
>>> here,
>>>>> since
>>>>>> they're the ones that will be using the syntax.
>>>>>>
>>>>>> John
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Philip Black-Knight [mailto:[hidden email]]
>>>>>> Sent: Thursday, August 30, 2012 1:17 PM
>>>>>> To: cee-discussion-list CEE-Related Discussion
>>>>>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
>>> discribe
>>>>> event
>>>>>> data
>>>>>>
>>>>>> I'm working on a government funded project (JALoP). The primary
>>> goals
>>>>> are to
>>>>>> author & implement a protocol to reliably transfer what we refer to
>>> as
>>>>> journal,
>>>>>> audit, and log data. Much of our efforts are related to what CEE
>>>>> labels the CLT.
>>>>>> However, we've also done some work (i.e. the schemas I just sent
>>> out)
>>>>>> related to describing log data in a strict format. Because our
>>> schemas
>>>>> are
>>>>>> similar (and have similar goals) we were hoping that it would be
>>>>> possible to
>>>>>> combine our efforts.
>>>>>>
>>>>>> We feel that our schemas are a lot simpler and easier to understand
>>>>> than the
>>>>>> current CEE approach, and would like to see the CEE format
>>> transformed
>>>>> into
>>>>>> something a little simpler and easier to follow.
>>>>>>
>>>>>> -Phil
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Raffael Marty [mailto:[hidden email]]
>>>>>> Sent: Thursday, August 30, 2012 12:12 PM
>>>>>> To: Philip Black-Knight
>>>>>> Cc: [hidden email]
>>>>>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally
>>> discribe
>>>>> event
>>>>>> data
>>>>>>
>>>>>> Hi Phil,
>>>>>>
>>>>>> When you say "we have been ...", who is we?
>>>>>>
>>>>>> The goals you list are pretty much the ones we are addressing with
>>>>> CEE. Glad
>>>>>> we are all on the same page. Would love to hear your specific input
>>> on
>>>>> CEE.
>>>>>> Thanks
>>>>>>
>>>>>>   -raffy
>>>>>>
>>>>>> --
>>>>>>   Raffael Marty
>>>>>>   @raffaelmarty
>>>>> http://raffy.ch
>>>>>> On Aug 30, 2012, at 5:15 AM, Philip Black-Knight
>>> <[hidden email]>
>>>>>> wrote:
>>>>>>
>>>>>>> We've been working on a similar effort for describing and
>>> enumerating
>>>>>> events. Our architecture is comprised of 3 distinct documents:
>>>>>>> -       EventProfileList (profiles.xsd) - An EventProfileList is
>>> an
>>>>> enumeration of
>>>>>> events for a class of device (e.g. router, Cisco router, UNIX
>>> system,
>>>>> Solaris).
>>>>>> The EventProfileList can be for a generic device (e.g. router,
>>> switch,
>>>>> UNIX
>>>>>> system) or specific to a vendor/product (e.g. Cisco Router, Juniper
>>>>> switch,
>>>>>> Oracle Solaris). This is similar to how SNMP works. The type for
>>> each
>>>>> field is
>>>>>> recorded in the EventProfile, and would be one of the existing CEE
>>>>> dictionary
>>>>>> types.
>>>>>>> -       Events (events.xsd) - An Events document is a listing of
>>> one
>>>>> or more
>>>>>> event instances (called records). Each record references an Event
>>> from
>>>>> a
>>>>>> specific EventProfile.
>>>>>>> -       Translations (translations.xsd) - A Translation document
>>>>> provides a
>>>>>> human readable, language specific, version of the events enumerated
>>> in
>>>>> an
>>>>>> EventProfileList. To display the details of an Event, you would use
>>> a
>>>>> record
>>>>>> from an Events document, combined with the translation from a
>>>>> Translations
>>>>>> document, to render a human readable string in the user's language
>>> of
>>>>> choice.
>>>>>>> For example, an EventProfileList could include an event that
>>>>> indicates a user
>>>>>> logged on to a particular system. In the EventProfileList, the
>>>>> descriptor for this
>>>>>> login event would indicate the required fields: user_name,
>>> hostname,
>>>>> and tty.
>>>>>> When a login event occurs, the system would record the event in an
>>>>> Events
>>>>>> document, including the values for these fields. A Translations
>>>>> document (for
>>>>>> US English) might include the format string: "%user_name% logged on
>>> to
>>>>>> %hostname% on %tty% at %timestamp%". Using this format string, and
>>> the
>>>>>> event record from the Events document, a human readable string can
>>> be
>>>>>> rendered for the user.
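
The rendering step described above can be sketched as follows, assuming placeholders are written as `%name%` exactly as in the quoted format string. The `render` function, the record layout, and the `pts/0` tty value are hypothetical; the actual Translations schema is in the attached files and is not reproduced here.

```python
import re


def render(template, record):
    """Replace each %name% placeholder with the matching value from the record."""
    def substitute(match):
        name = match.group(1)
        if name not in record:
            raise KeyError(f"record has no field named {name!r}")
        return str(record[name])

    return re.sub(r"%(\w+)%", substitute, template)


# Format string taken from the message above; the record values are made up.
template = "%user_name% logged on to %hostname% on %tty% at %timestamp%"
record = {
    "user_name": "jsmith",
    "hostname": "foo.dod.mil",
    "tty": "pts/0",
    "timestamp": "2001-12-31T12:00:00",
}

message = render(template, record)
print(message)
```

Because the template lives outside the code, swapping in a template for another language changes the output without touching `render`.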
>>>>>>> One of the primary goals is to move away from loosely formatted
>>> event
>>>>>> messages which are hard to parse, and can change from one version
>>> of
>>>>>> a software package to the next. Using a common format to enumerate
>>> event
>>>>>> records makes processing and correlating data simpler. This is all
>>>>> part of a
>>>>>> larger effort to define data formats and protocols for reliably
>>>>> transferring
>>>>>> Journal/Audit/Logging events. Additionally, we'd like to be able to
>>>>> align
>>>>>> various software products that perform similar tasks (i.e. routers,
>>> or
>>>>> network
>>>>>> switches, etc) to all use a common set of events so that a router
>>>>> built by Cisco
>>>>>> & a router built by Juniper Networks would both generate the same
>>>>> events.
>>>>>> Each vendor would be free to augment their list of events (by
>>>>> providing their
>>>>>> own EventProfileLists) and would be able to define new events,
>>> and/or
>>>>>> extend existing events.
>>>>>>> Existing syslog data is easily recorded using this approach. The
>>>>> EventProfile
>>>>>> for syslog data indicates the required fields for msg (the actual
>>>>> syslog
>>>>>> message), facility, priority, hostname, applicationName, etc that
>>> are
>>>>> part of
>>>>>> standard syslog messages. The attached zip includes an
>>>>> EventProfileList &
>>>>>> sample Events document for syslog.
>>>>>>>
>>>>>>> Another important goal was to support existing logging systems.
>>> While
>>>>>> creating these schemas, we studied Syslog, Windows Event Structure,
>>>>> Solaris
>>>>>> Audit, and Linux Audit and believe that all can be represented in
>>> our
>>>>> Event
>>>>>> structures with no loss of information or context.
>>>>>>> Other features:
>>>>>>>
>>>>>>> -       Support optional digital signatures of all XML files to
>>> meet
>>>>> government
>>>>>> requirements (Sarbanes-Oxley, Dodd-Frank, HIPAA, etc.) for audit
>>> log
>>>>>> integrity and non-repudiation
>>>>>>> -       Updates to language translations (both new languages and
>>> to
>>>>> fix errors
>>>>>> in translations) can be added without the need to update a single
>>> line
>>>>> of code.
>>>>>>> -       Both the EventProfileLists and the Event descriptors
>>> within
>>>>> the
>>>>>> EventProfile list are versioned, allowing event records &
>>> translations
>>>>> to target
>>>>>> a specific version of an event.
>>>>>>> -       Documents containing Event records may be in XML or JSON
>>>>> format
>>>>>>> -       Limited support for Event inheritance. Each Event
>>> descriptor
>>>>> can inherit
>>>>>> from some other Event descriptor (multiple inheritance is not
>>>>> supported). For
>>>>>> example, a generic profile could exist that describes a "logon"
>>> event,
>>>>> and
>>>>>> specifies the required fields username, hostname, and time. The
>>>>>> EventProfileList for Windows would identify a login event that
>>>>> inherits from
>>>>>> this base, and adds the fields first_name, last_name, guid, etc.
>>> The
>>>>>> EventProfileList for Linux would add fields for uid, tty, etc. When
>>> a
>>>>> Linux Logon
>>>>>> event is recorded, in addition to the uid and tty, it must also
>>>>> specify the fields
>>>>>> from the inherited class (i.e. username, hostname, and time). This
>>> is
>>>>> similar to
>>>>>> current CEE approach to taxonomies.
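
The single-inheritance rule in the last bullet can be sketched like this. `EventDescriptor`, `all_required`, and `missing_fields` are hypothetical names chosen for the example, not part of the JALoP schemas; the field names come from the logon example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventDescriptor:
    name: str
    required: tuple
    parent: Optional["EventDescriptor"] = None  # at most one parent

    def all_required(self):
        """Collect required fields from this descriptor and every ancestor."""
        inherited = self.parent.all_required() if self.parent else []
        return inherited + [f for f in self.required if f not in inherited]


def missing_fields(descriptor, record):
    """List the required fields, own or inherited, that the record lacks."""
    return [f for f in descriptor.all_required() if f not in record]


# Generic logon event and a Linux-specific event that inherits from it.
logon = EventDescriptor("logon", ("username", "hostname", "time"))
linux_logon = EventDescriptor("linux_logon", ("uid", "tty"), parent=logon)

# This record supplies the Linux fields but only one of the inherited ones.
record = {"uid": 1000, "tty": "pts/0", "username": "jsmith"}
missing = missing_fields(linux_logon, record)
print(missing)
```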
>>>>>>>
>>>>>>> To further support this effort, we've been working on a network
>>>>>>> protocol to reliably and securely transfer Event records off box.
>>>>> These
>>>>>> protocols are designed to support raw XML as well as compressed
>>> formats
>>>>>> such as Efficient XML Interchange. Attached are a draft set of
>>> schemas
>>>>> and
>>>>>> sample documents (it is just a renamed zip file).
>>>>>>> -Phil
>>>>>>>
>>>>>>> <eventSchemas.1814d42.piz>

Re: Similar effort to formally describe event data

william.leroy
In reply to this post by Adam Montville
Adam,

I believe uniqueness that is global across all enterprises is ideal.
For service providers, or across service providers, you may have an
internal organization that is either global or regional, and an
outside service provider that is regional or global, depending on the
event information being collected. One organization might be running
different software but still want to forward a common event that all
parties would understand, including the original source device. If
the device ID were unique and the event ID were unique, I would
probably have a chance of understanding where the event originally
came from.
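
A minimal sketch of that idea, which is close to option 3 in John's list quoted below: each producer (or device) takes one UUID at startup and then numbers its own events, so the pair stays unique across organizations. The `Producer` class and its method names are assumptions for illustration only.

```python
import itertools
import uuid


class Producer:
    """Hypothetical event producer that namespaces its own record IDs."""

    def __init__(self):
        # Generated once per producer, so the UUID cost is not paid per event.
        self.producer_id = str(uuid.uuid4())
        self._counter = itertools.count(1)

    def next_event_id(self):
        # The (producer, record) pair is the globally unique key.
        return (self.producer_id, next(self._counter))


a = Producer()
b = Producer()
ids = [a.next_event_id(), a.next_event_id(), b.next_event_id()]
for producer_id, record_id in ids:
    print(producer_id, record_id)
```

Both producers start counting at 1, yet the pairs do not collide, and the first element tells a downstream collector which device the event came from.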




On Mon, Sep 24, 2012 at 12:53 PM, Adam Montville
<[hidden email]> wrote:

> On 9/24/12 9:44 AM, "Raffael Marty" <[hidden email]> wrote:
>
>>#3 is completely different from the other semantics. That's a sessionID,
>>which should not be in this field. Different field, different discussion.
>
> Agree.
>
>>I think the idea here is a unique ID. I don't like #1. You can just use
>>hash(data) as a UUID. Why require that? I think in general, I am against
>>requiring it. I think we need it in general though.
>
> Which hash would you use and what are the precise uniqueness requirements?
>  Global in the context of a single enterprise, or global across all
> enterprises?  How many events would that mean we'd be generating IDs for
> over what period of time?  Then, what are the collision rates of the
> chosen hash?  It'll probably all work out, but being precise here might be
> prudent.
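
For a rough feel of the collision question, the standard birthday-bound approximation p ≈ 1 - exp(-n(n-1) / 2^(b+1)) gives the chance of at least one collision among n IDs drawn uniformly from a b-bit space. The sketch below assumes the hash output behaves like a uniform random value; the event counts are made-up figures, not measurements.

```python
import math


def collision_probability(n_events, hash_bits):
    """Birthday-bound approximation of at least one collision among n_events IDs."""
    exponent = -n_events * (n_events - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


# Hypothetical volume: one trillion events, compared across ID widths.
for bits in (64, 128, 256):
    p = collision_probability(10**12, bits)
    print(f"{bits}-bit IDs, 10^12 events: p = {p:.3e}")
```

Note that hashing the event data also means two byte-identical events get the same ID, which is a separate question from random collisions.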
>
> About setting it as a requirement, what are the interoperability
> implications if we don't require the field to be populated?
>
>
>>
>>  -raffy
>>
>>--
>>  Raffael Marty
>>  ceo @ pixlcloud
>>http://pixlcloud.com
>>  @raffaelmarty                                          http://raffy.ch
>>
>>
>>
>>On Sep 24, 2012, at 9:39 AM, "Wunder, John A." <[hidden email]> wrote:
>>
>>> Definitely an important question. But it's also important to not
>>>overthink it.
>>>
>>> Some of the options I can think of are:
>>>
>>> 1. UUID. Upside is a guarantee of uniqueness, downside might be
>>>generation time.
>>> 2. Unique string or integer assigned by producer. This could be
>>>implemented as an incrementing ID but no reason to require that in the
>>>spec. Basically the producer would just assign unique IDs as a best
>>>effort and consumers would need to know that there could be conflicts
>>>from different producers.
>>> 3. #2 plus add a producer (or "session") ID. That ID could be a UUID
>>>since it wouldn't need to be generated as often, but would give the
>>>producer some namespace in which to produce unique IDs.
>>>
>>> Another thing to consider is whether the recordId would be required or
>>>optional.
>>>
>>> John
>>> ________________________________________
>>> From: Adam Montville [[hidden email]]
>>> Sent: Monday, September 24, 2012 12:18 PM
>>> To: cee-discussion-list CEE-Related Discussion
>>> Subject: Re: [CEE-DISCUSSION-LIST] Similar effort to formally describe
>>>event data
>>>
>>> On 9/24/12 8:51 AM, "Raffael Marty" <[hidden email]> wrote:
>>>
>>>> What are the exact semantics of the field? An increasing ID? A UUID?
>>>
>>>
>>> This is a good question.  How do you envision this working between use
>>> cases?  We have n collectors receiving events.  For each collector, they
>>> are either receiving CEE-formatted events or not.  In the case where
>>>they
>>> are receiving CEE-formatted events, then what should we expect to see in
>>> such a field?  On the other hand, if events are normalized into CEE,
>>>then
>>> what would we expect to see?
>>>
>>> I would argue that the latter need not be described in too much detail,
>>> but that the former will need some operational support beyond the
>>> specification, unless we attempt a globally unique identification
>>>scheme,
>>> such as using a URI of some sort, which may become burdensome quickly.
>>>
>>> Caveat with what I've said is that I'm dive bombing into the middle of a
>>> discussion - fully aware that there may be information I'm lacking at
>>>this
>>> point.  Mitch is probably closer to the discussion than I.
>>>
>>> Regards,
>>>
>>> Adam
>>>
>>>
>>>
>>>
>>>
>>>
>>>>
>>>> --
>>>> Raffael Marty
>>>> ceo @ pixlcloud
>>>> http://pixlcloud.com
>>>> @raffaelmarty                                          http://raffy.ch
>>>>
>>>>
>>>>
>>>> On Sep 24, 2012, at 8:45 AM, Mitch Thomas <[hidden email]> wrote:
>>>>
>>>>> Please consider adding it (back).
>>>>>
>>>>>
>>>>> On 9/23/12 12:53 PM, "Wunder, John A." <[hidden email]> wrote:
>>>>>
>>>>>> Should we consider adding it back? Or if I'm mistaken and CEE has
>>>>>>never
>>>>>> included it, should we add it?
>>>>
>>>>
>>
>>



--
Bill LeRoy


[hidden email]