Voting period: flat vs. Hierarchical field names


Wunder, John A.

Hey Everyone,

 

Let’s actually open up the voting period for this: through August 10, 2012 please respond to this message with your vote (choices 1-4 below, or something else) and optionally some discussion or reasoning behind it.

 

As a reminder, the topic is whether field names in the CEE dictionary will allow hierarchical names or simply flat names.

 

In reality this is two separate questions. First, is there a conceptual hierarchy in the data we’re trying to represent, regardless of how we choose to represent it? Second, assuming there is a hierarchy, do we want to use structured markup to represent it or do we want to use a standard delimiter?

 

I see a set of four options to answer these two questions:

 

1.       There is no conceptual hierarchy, so only flat field names are allowed.

2.       There is a conceptual hierarchy and we should REQUIRE people use structured markup to express it. So hierarchical names are allowed, and MUST be represented using structured JSON or XML.

3.       There is a conceptual hierarchy and we should allow people to use either structured markup OR a CEE-specific syntax (like dotted notation) to express it. So hierarchical names are allowed, and MAY be represented either using structured JSON or XML or a flat, dotted notation.

4.       There’s a conceptual hierarchy, but we should REQUIRE people use a flat, dotted notation to express it.

 

As an example, imagine the source IPv4 address. You could conceptually represent this as a hierarchy, with the parent object being “source” and the field name being “ipv4”. You’d have further fields under the “source” object, like “port”. You could also have “ipv4” nested under some other parent object, like “destination”. For each of the proposals above, you’d have objects like:

 

1.       {"source_ipv4": "192.168.1.1"}

2.       {"source": {"ipv4": "192.168.1.1"}}

3.       {"source": {"ipv4": "192.168.1.1"}} OR {"source.ipv4": "192.168.1.1"}

4.       {"source.ipv4": "192.168.1.1"}

 

Note that 1 and 4 are fairly similar. The biggest difference is whether we treat field names as a hierarchy when we think about them. That may not affect how the field names are represented, but personally I think it’s an important distinction that we’d take into account when evaluating field names.
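
To make the relationship between the representations concrete, here is a minimal Python sketch (not part of any CEE specification) that collapses the structured form from option 2 into the flat names of options 4 and 1. The function name `flatten`, its `sep` parameter, and the `port` and `destination` values are assumptions made up for illustration.

```python
def flatten(event, parent="", sep="."):
    """Collapse nested dicts into a single-level dict with delimited keys."""
    flat = {}
    for key, value in event.items():
        # Build the full name from the parent path, if there is one.
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into child objects such as "source" or "destination".
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


nested = {
    "source": {"ipv4": "192.168.1.1", "port": 514},
    "destination": {"ipv4": "10.0.0.5"},
}

print(flatten(nested))           # dotted names, as in option 4
print(flatten(nested, sep="_"))  # underscore names, as in option 1
```

The same data yields either naming style, which is why the choice between 1 and 4 is mostly about how the dictionary is organized rather than what goes on the wire.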

 

John

Re: Voting period: flat vs. Hierarchical field names

Wunder, John A.

In response to the last message, we heard from three people who started discussing the issue. I copied what they said below, but I broke out the voting differently this time, so if you guys have any clarifications, go for it.

 

Anton Chuvakin:

Isn't there an option to only allow hierarchical names for a select few fields where it is really, really needed? Otherwise, I am afraid that the complexity will kill it. If we cannot do that, I'd vote for FLAT. JSON/plain text is expected to be used more than XML, so cluttering it with hierarchy is not healthy.

 

George Saylor:

Agreed, there are a precious few things that might need this, and any extra is likely a barrier to adoption. IP addressing is one. Entities in LDAP and X.509 are potential candidates. If we can get away with flat, it would be better IMO.

Balazs Vamos:

As I see it, the hierarchical field representation is only useful when we would like to show the hierarchy. For storing, indexing, and retrieving the data, we prefer flat field names.

 

 

 

From: Wunder, John A. [mailto:[hidden email]]
Sent: Monday, July 30, 2012 1:06 PM
To: cee-discussion-list CEE-Related Discussion
Subject: [CEE-DISCUSSION-LIST] Voting period: flat vs. Hierarchical field names

 

Re: Voting period: flat vs. Hierarchical field names

William Heinbockel
In reply to this post by Wunder, John A.
I vote to keep the CEE field names and namespace as flat as possible.
Structure should be allowed and primarily used for extensions (think protocol
header fields) and product-specific fields.


On 07/30/2012 01:06 PM, Wunder, John A. wrote:

> Hey Everyone,
>
> Let's actually open up the voting period for this: through August 10, 2012 please respond to this message with your vote (choices 1-4 below, or something else) and optionally some discussion or reasoning behind it.
>
> As a reminder, the topic is whether field names in the CEE dictionary will allow hierarchical names or simply flat names.
>
> In reality this is two separate questions. First, is there a conceptual hierarchy in the data we're trying to represent, regardless of how we choose to represent it? Second, assuming there is a hierarchy, do we want to use structured markup to represent it or do we want to use a standard delimiter?
>
> I see a set of four options to answer these two questions:
>
>
> 1.       There is no conceptual hierarchy, so only flat field names are allowed.

This is the simplest, but it makes custom/extended fields hard to find.
Vendors would be responsible for adding field names and hoping that there is no
conflict. Keeping up with new releases of CEE field names becomes burdensome and
requires vendors to continually update their logging (or support some sort of
CEE versioning).

>
> 2.       There is a conceptual hierarchy and we should REQUIRE people use structured markup to express it. So hierarchical names are allowed, and MUST be represented using structured JSON or XML.

The vast majority of products use flat structures for logging: syslog, CSV,
journald, etc.
Most logs are stored in name-value stores/DBs that do not support the structure.

This makes it pretty much a non-starter for most Unixes, including Red Hat, to support CEE.

>
> 3.       There is a conceptual hierarchy and we should allow people to use either structured markup OR a CEE-specific syntax (like dotted notation) to express it. So hierarchical names are allowed, and MAY be represented either using structured JSON or XML or a flat, dotted notation
>

This allows for the most flexibility, and eliminates any ambiguities in
conversions between structured and flat logs because the mapping is defined by the spec.
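
As a rough illustration of what a spec-defined conversion could look like, the Python sketch below expands dotted names into the structured form, so both spellings allowed under option 3 normalize to the same object. The function name `unflatten` and the choice to reject conflicting names with a `ValueError` are assumptions for this example, not anything CEE defines.

```python
def unflatten(flat, sep="."):
    """Expand delimited keys into nested dicts; nested input is normalized too."""
    nested = {}
    for name, value in flat.items():
        node = nested
        *parents, leaf = name.split(sep)
        for part in parents:
            # Walk (or create) each parent object along the path.
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(f"{name!r} conflicts with a scalar field")
        if leaf in node:
            raise ValueError(f"{name!r} is defined more than once")
        # Values that are already structured are normalized recursively.
        node[leaf] = unflatten(value, sep) if isinstance(value, dict) else value
    return nested


dotted = {"source.ipv4": "192.168.1.1"}
structured = {"source": {"ipv4": "192.168.1.1"}}

# Both option 3 spellings end up as the same structured record.
print(unflatten(dotted) == unflatten(structured))
```

A record that mixes a scalar `source` with `source.ipv4` is rejected here; a real spec would have to say explicitly how such cases are handled.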

> 4.       There's a conceptual hierarchy, but we should REQUIRE people use a flat, dotted notation to express it.
>

At this point, I see no difference between #3 and #4. You have to define the
hierarchy and the conversions.


--
William Heinbockel
Product Security Team, Red Hat
[hidden email]
Re: Voting period: flat vs. Hierarchical field names

Jake Evans
In reply to this post by Wunder, John A.

I’m also in favor of using as flat a representation as possible.  I vote for #1.

Jake

 

Jake Evans, GCIH, GWAPT, GCIA, CISSP, CISA, CRISC, CPISA, CPISM | Sr. Security and Compliance Analyst

 

Tripwire

101 SW Main St., Ste. 1500
Portland, OR 97204  

Direct: 503.276.7658

Main: 503.276.7500

 

TRIPWIRE | TAKE CONTROL.
www.tripwire.com

 

Re: Voting period: flat vs. Hierarchical field names

Guy Bruneau
In reply to this post by Wunder, John A.
I vote for option #1

Guy
Guy Bruneau, B.A., CD
GIAC GSE
Senior Security Consultant
IPSS Inc.
150 Isabella St., Suite 101
Ottawa, On K1S 1V7
Phone: 613-232-2228 Ext: 225
Cell: 613-851-6222
Fax: 613-231-4888
Toll free: 1-866-532-2207
www.ipss.ca

 
Re: Voting period: flat vs. Hierarchical field names

Rainer Gerhards
In reply to this post by Jake Evans

+1

 

With an additional comment: maybe we should specify a standard flat representation of hierarchy (dotted notation comes to mind) and require compliant senders to be configurable to emit that format.

 

Rainer

 

From: Jake Evans [mailto:[hidden email]]
Sent: Monday, July 30, 2012 8:54 PM
To: [hidden email]
Subject: Re: [CEE-DISCUSSION-LIST] Voting period: flat vs. Hierarchical field names

 

I’m also in favor of using as flat a representation as possible.  I vote for #1.

Jake

 

Re: Voting period: flat vs. Hierarchical field names

Wunder, John A.

That’s pretty much what I was getting at by #4. Conceptual hierarchy, but only allow representing it in dotted notation.

 

From: Rainer Gerhards [mailto:[hidden email]]
Sent: Monday, July 30, 2012 3:04 PM
To: cee-discussion-list CEE-Related Discussion
Subject: Re: [CEE-DISCUSSION-LIST] Voting period: flat vs. Hierarchical field names

 

+1

 

With an additional comment: maybe we should specify a standard flat representation of hierarchy (dotted notation comes to mind) and require compliant senders to be configurable to emit that format.

 

Rainer

 


Re: Voting period: flat vs. Hierarchical field names

william.leroy
Dear CEE Team,

What about the loggers themselves? Consider this example from the Apache log4j FAQ:
http://logging.apache.org/log4j/1.2/faq.html

"You may choose to name loggers by functionality and subcategorize by
locality, as in "DATABASE.com.foo.some.package.someClass" or
"DATABASE.com.foo.some.other.package.someOtherClass".

You are totally free in choosing the names of your loggers. The log4j
package merely allows you to manage your names in a hierarchy.
However, it is your responsibility to define this hierarchy.

Note by naming loggers by locality one tends to name things by
functionality, since in most cases the locality relates closely to
functionality."
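
The same dotted-name hierarchy exists outside log4j. As a small sketch, Python's standard `logging` module also treats dots in logger names as parent/child separators, so the names from the FAQ excerpt behave hierarchically there too. The level chosen below is arbitrary and only for demonstration.

```python
import logging

# Create the ancestor first, then a descendant named as in the log4j FAQ.
parent = logging.getLogger("DATABASE.com.foo")
child = logging.getLogger("DATABASE.com.foo.some.package.someClass")

# Configuration set on the ancestor is inherited through the dotted name.
parent.setLevel(logging.WARNING)

print(child.parent is parent)
print(child.getEffectiveLevel() == logging.WARNING)
```

Nothing in the name itself is structured; the hierarchy is purely a convention applied to a flat string, which is close to what option 4 proposes for field names.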

On Mon, Jul 30, 2012 at 3:09 PM, Wunder, John A. <[hidden email]> wrote:

> That’s pretty much what I was getting at by #4. Conceptual hierarchy, but
> only allow representing it in dotted notation.
>
>
>



--
Bill LeRoy


[hidden email]
Re: Voting period: flat vs. Hierarchical field names

Bill Scherr IV
In reply to this post by Wunder, John A.
Option #1

B.

Circa 13:06, 30 Jul 2012, a note, claiming source Wunder, John A.
<[hidden email]>, was sent to me:

Date sent: Mon, 30 Jul 2012 13:06:23 -0400
Send reply to: "Wunder, John A." <[hidden email]>
From: "Wunder, John A." <[hidden email]>
Subject: [CEE-DISCUSSION-LIST] Voting period: flat vs. Hierarchical field names
To: <[hidden email]>



STOP SPAM  -  use whitelists

pub   1024D/6382216F 2008-06-20 [expires: 2013-06-19]
Key fingerprint = 5F6A F5AD 1FE0 73CA 2393  A62E 2469 0F95 6382 216F
uid    Bill Scherr IV (Ownership is Vital) <[hidden email]>
Re: Voting period: flat vs. Hierarchical field names

dpal
In reply to this post by Wunder, John A.

If we go with option 1, we need to define a naming convention for things that have multiple values (more than one IP address, for example) or multiple representations (full user name vs. uid), which brings us close to option 4.
So I view option 4 as option 1 with a specific naming convention already partially defined. IMO the naming convention in option 4 is incomplete in this case, so I would rather say option 1 plus a more generic naming convention.
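
To make the naming-convention point concrete, here is a minimal Python sketch of what "option 1 plus a generic naming convention" might look like. The underscore separator, the 1-based index suffix for repeated values, the helper name flat_name, and the sample values are all assumptions made for illustration; none of them are defined by CEE or proposed in this thread.

```python
def flat_name(*parts, index=None):
    """Build a flat field name from its parts (hypothetical convention).

    Parts are joined with '_'. If index is given, it is appended as a
    suffix so that repeated values get distinct flat names.
    """
    name = "_".join(parts)
    return name if index is None else f"{name}_{index}"


event = {}

# Multiple representations of the same thing: full user name vs. uid.
event[flat_name("user", "name")] = "Jane Doe"   # made-up sample value
event[flat_name("user", "uid")] = 1001          # made-up sample value

# Multiple values for the same field: more than one source IP address.
for i, addr in enumerate(["192.168.1.1", "10.0.0.1"], start=1):
    event[flat_name("source", "ipv4", index=i)] = addr

print(event)
```

Every name stays flat, but the convention itself already encodes a parent ("user", "source") and a child ("name", "ipv4"), which is why a sufficiently general convention for option 1 ends up looking much like option 4.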


-- 
Thank you,
Dmitri Pal

Sr. Engineering Manager for IdM portfolio
Red Hat Inc.


-------------------------------
Looking to carve out IT costs?
www.redhat.com/carveoutcosts/



Re: Voting period: flat vs. Hierarchical field names

Wunder, John A.
In reply to this post by Wunder, John A.

Just a friendly reminder that you have until this Friday to send in your vote if you haven’t already. Anyone who’s familiar with CEE can vote, no need to be active on the list or even be subscribed to the list. Please just identify yourself and how you’re planning to use CEE along with your vote.

 

John

 

From: Wunder, John A.
Sent: Monday, July 30, 2012 1:14 PM
To: cee-discussion-list CEE-Related Discussion
Subject: RE: Voting period: flat vs. Hierarchical field names

 

In response to the last message, we heard from 3 people who started discussing the issue. I copied what they said below, but I broke out the voting differently this time, so if you guys have any clarifications, go for it.

Anton Chuvakin:

Isn't there an option to only allow hierarchical names for a select few fields where it is really, really needed? Otherwise, I am afraid that the complexity will kill it. If we cannot do that, I'd vote for FLAT. JSON/plain text is expected to be used more than XML, so cluttering it with hierarchy is not healthy.

George Saylor:

Agreed, there are a precious few things that might need this, and any extra is likely a barrier to adoption. IP addressing is one. Potentially, entities in LDAP and X.509 are candidates. If we can get away with flat, it would be better IMO.

Balazs Vamos:

As I see it, the hierarchical field representation is only useful when we would like to show the hierarchy. For storing, indexing, and retrieving the data, we prefer flat field names.


Re: Voting period: flat vs. Hierarchical field names

Balazs Scheidler
On Mon, 2012-08-06 at 13:55 -0400, Wunder, John A. wrote:
> Just a friendly reminder that you have until this Friday to send in
> your vote if you haven’t already. Anyone who’s familiar with CEE can
> vote, no need to be active on the list or even be subscribed to the
> list. Please just identify yourself and how you’re planning to use CEE
> along with your vote.
>
As described by Gergely Nagy, we would like nested values, with a
canonical transformation (e.g. dotted notation) between the nested and
flat forms.

This way nested attributes can describe a grouping when transferred on
the wire, and at the same time can easily be processed by systems which
use a flat namespace.
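
A transformation of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names flatten and unflatten are hypothetical, the separator defaults to a dot, field names are assumed never to contain the separator themselves, and empty nested objects are not preserved. The port value is made up for the example.

```python
def flatten(obj, sep=".", prefix=""):
    """Convert a nested dict into a flat dict with delimiter-joined names."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into the child object, carrying the parent name along.
            flat.update(flatten(value, sep, name))
        else:
            flat[name] = value
    return flat


def unflatten(flat, sep="."):
    """Rebuild the nested dict from delimiter-joined names."""
    nested = {}
    for name, value in flat.items():
        *parents, leaf = name.split(sep)
        node = nested
        for part in parents:
            # Create intermediate objects on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


wire = {
    "source": {"ipv4": "192.168.1.1", "port": 1234},
    "destination": {"ipv4": "10.0.0.1"},
}
stored = flatten(wire)
assert unflatten(stored) == wire  # the round trip is lossless here
```

The round trip is only canonical if the delimiter is reserved, so a rule forbidding the delimiter inside field names would have to accompany any such transformation.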


--
Bazsi

Re: Voting period: flat vs. Hierarchical field names

Berry, Chris
In reply to this post by Wunder, John A.

Hi Community,

I would like to offer my assessment of the four options and subsequently my vote.

== Assessment ==

1.       There is no conceptual hierarchy, so only flat field names are allowed.

As expressed by other posters, having a flat structure will reduce the chances of the community dismissing this classification system as yet another log standard. A straightforward classification with a flat structure will interoperate cleanly with many specialized and general-purpose databases. Granted, there might be benefits to having hierarchical structures for select classifications (e.g., health care logs, CDRs/IPDR, etc.), but we must crawl before we can walk. I would hate to see field name standards wither on the vine because they are too hard to implement.

2.       There is a conceptual hierarchy and we should REQUIRE people use structured markup to express it. So hierarchical names are allowed, and MUST be represented using structured JSON or XML.

XDAS appears to be a poster child of good intentions leading to limited results. It was designed to be hierarchical rather than flat, yet in my 9 years working with projects implementing logging systems, I have heard scant references (only from Novell) to this standard from my customers. I believe we need to avoid complexity to improve the chances of the community adopting this classification system.

JSON supports a flat structure, so there is no requirement for hierarchical names.

3.       There is a conceptual hierarchy and we should allow people to use either structured markup OR a CEE-specific syntax (like dotted notation) to express it. So hierarchical names are allowed, and MAY be represented either using structured JSON or XML or a flat, dotted notation

The standard should be consistent, not variable. Too many options will confuse people.

4.       There’s a conceptual hierarchy, but we should REQUIRE people use a flat, dotted notation to express it.

In terms of separators, the underscore character lends itself better to databases in terms of readability, and most *nix folks would be happy to support it. Also see my comments for 2 and 3.
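
As a rough illustration of the separator point, the sketch below uses SQLite through Python's standard sqlite3 module. The table names are hypothetical, and other database engines may behave differently. It shows that an underscore-separated name works as a plain column identifier, while a dotted name has to be quoted everywhere it appears, because an unquoted dot is read as a qualifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Underscore-separated name: usable as an ordinary, unquoted identifier.
conn.execute("CREATE TABLE events_flat (source_ipv4 TEXT)")
conn.execute("INSERT INTO events_flat VALUES ('192.168.1.1')")
flat_row = conn.execute("SELECT source_ipv4 FROM events_flat").fetchone()

# Dotted name, unquoted: SQLite rejects the column definition.
try:
    conn.execute("CREATE TABLE events_dotted (source.ipv4 TEXT)")
    unquoted_dot_accepted = True
except sqlite3.OperationalError:
    unquoted_dot_accepted = False

# Dotted name, quoted: works, but every query must quote it as well.
conn.execute('CREATE TABLE events_dotted ("source.ipv4" TEXT)')
conn.execute("INSERT INTO events_dotted VALUES ('192.168.1.1')")
dotted_row = conn.execute('SELECT "source.ipv4" FROM events_dotted').fetchone()

print(flat_row, unquoted_dot_accepted, dotted_row)
```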

 

== Conclusion ==

We need to keep this classification system simple and then grow it once it takes root within the community. Therefore, I must recommend option 1. We all know what happened to CBE for log management by IBM – its adoption went nowhere very fast…

 

Regards,

Chris


Re: Voting period: flat vs. Hierarchical field names

Clayton Dukes (cdukes)

I’d like to throw my vote in as well.

As a Cisco employee, I work with most of the Fortune 1000 companies out there and I think the biggest hurdle will be adoption.

Coders are lazy.

(I fall under this category, so I’m self-deprecating here, please don’t take offense)

…but the point is that, with the exception of those of us who really care, people writing the code will stick to good ol’ RFC 3164 because they don’t want to do the work.

I still have to really look to find someone utilizing RFC 5424, even though it is most certainly more robust.

As the creator of LogZilla, my biggest concerns are simplicity and scalability.

When we implement support for this (and we will), we want our users running the tool to be able to comprehend the benefit and to use it to their advantage.

If it is too complex, I fear that it would just end up falling back to that “good ol’ RFC 3164” problem again.

We also need to be cognizant of the overhead it will add on large networks – 50k devices already send a ton of data over the wire.

So my vote is to keep it flat and simple (preferably JSON).

Clayton Dukes

 

 
