Ordinality on reference fields

robertDouglass's picture

CCK is becoming quite useful and living up to its promise. The node and user reference fields are not terribly useful yet, though. The Add and Reference and Form Restore modules by Artem Timofeev are great additions to the interface aspect.

By far, however, the largest issue that limits the real use of the node and user reference fields is their lack of ordinality. It is impossible to say that the relation is 1:1, 1:n or n:m (if we were really daring we'd implement an inheritance reference as well). For example, I'd like to use CCK to have a database of composers and the pieces they've written. Assuming I could work out all the interface and workflow issues that make it rather clunky to catalog the complete works of Mozart and Telemann using these tools, the real knockout criterion that prevents this from working is that there is no way to say that a piece can only be referenced by one composer (1:n). What good would the database be if it were possible to say that both Beethoven and Stravinsky wrote the Moonlight Sonata?

So I'd like to hear what thinking others have done on the issue and who, if anyone, has any concrete plans.

Part of me thinks that the implementation of references as fields is misguided and that the whole relationships/references thing should be its own system. After all, a bi-directional reference that enforces ordinality rules doesn't really belong to just one of the involved nodes, but rather both. Or, perhaps it doesn't belong to either of them, but they belong to it. Maybe we shouldn't be sticking reference fields in node types, but rather creating reference node types (or reference types) that get configured with a left and right node type and ordinality rules.
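To make the "reference type" idea a bit more concrete, here is a rough Python sketch (not Drupal code). The class name `ReferenceType` and its fields are made up for illustration; nothing like this exists in CCK as far as this thread says. It only shows a relation that lives outside both node types and enforces its own ordinality rule.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class ReferenceType:
    """Hypothetical standalone relation between a left and a right node type."""
    name: str
    left_type: str
    right_type: str
    max_left_per_right: Optional[int] = None  # None means "unlimited"
    max_right_per_left: Optional[int] = None
    links: Set[Tuple[str, str]] = field(default_factory=set)

    def add(self, left_id: str, right_id: str) -> None:
        """Store a link, or raise ValueError if an ordinality rule would break."""
        if (left_id, right_id) in self.links:
            return
        lefts_of_right = sum(1 for _, r in self.links if r == right_id)
        rights_of_left = sum(1 for l, _ in self.links if l == left_id)
        if self.max_left_per_right is not None and lefts_of_right >= self.max_left_per_right:
            raise ValueError(f"{right_id} already has {lefts_of_right} {self.left_type}(s)")
        if self.max_right_per_left is not None and rights_of_left >= self.max_right_per_left:
            raise ValueError(f"{left_id} already has {rights_of_left} {self.right_type}(s)")
        self.links.add((left_id, right_id))


# 1:n: a composer may have many pieces, a piece only one composer.
composed = ReferenceType("composed", "composer", "piece", max_left_per_right=1)
composed.add("Beethoven", "Moonlight Sonata")
composed.add("Beethoven", "Fidelio")

rejected = None
try:
    composed.add("Stravinsky", "Moonlight Sonata")
except ValueError as err:
    rejected = str(err)
```

Setting both limits to 1 would give 1:1, and leaving both at `None` gives n:m. The point is only that the rule belongs to the relation, not to either node.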

Comments

Once again: how about the Relations API in core?

tema's picture

Thank You Robert!

So many things have happened since I dared to publish my modules. The multipart forms API was committed to HEAD, so my Form Restore module has become useless and Add and Reference is outdated.

IMO node relationships are important enough to deserve a separate API. As I see it, the Blessed Trinity of a document framework is:

  1. content construction API (CCK)
  2. custom queries API (Views) and
  3. relationship API (Relationship[?])

The last is dman's protracted project. I've watched its progress for a long time but cannot use it because of its instability. Despite everything, this module covers a huge sphere of action, and IMO it's very promising.

Other promising modules have been started by fago: Pageroute, Node Family, Node Profile. It would be interesting to learn his views on node relations and on the future of CCK.

Today's relationship functionality based on CCK nodereference can be used right now for real-world projects, but as mentioned it lacks logic and usability. I wish the Drupal core developers would acknowledge our need for a more reliable and forward-looking relationship API.

Protracted, indeed.

dman's picture

To call my efforts unstable is being generous :-B

To address the theme of the OP here, my thoughts are :
Cardinality constraints as defined by (or at least discussed by) OWL are a useful tool in the relationship-making business.

I've not looked into it deeper than acknowledging it as a nice-to-have somewhere down the track.
When implementing logic, however (cascading or implied properties, inheritance and reciprocals), enforcing the constraints will take a heck of a logic engine. It could get messy, but it's certainly worth trying.

I have to agree that the 'relationship' links live best outside of node items themselves. But I also have found that the two ends of a link are independently important. There is a tiny but (I think) significant difference between the statement "Haydn composed Concerto #12" and "Concerto #12 was composed by Haydn".
Differentiating the subject from the object is more than just calling it the 'left' and 'right' term of a link. It tells you a little of what the statement means.
"Josh likes Rock Music" is a more specific or memorable statement than "Rock Music is liked by Josh"

On one factual level, yeah, they are equivalent, but on a storage/maintenance level, this link is edited in Josh's profile - not as part of the definition of rock music.

Thus, bidirectionality is sometimes a given (eg 'sibling'), sometimes implied ('contains/containedBy') and sometimes has very little weight at all.

Despite the moves towards 'everything is a node', the earlier implementation of this (node_relation IIRC) seemed to pile up lots of awkwardness without giving much utility in return. I don't like the look of going in that direction again.

Perhaps I'll fire up the latest incarnation of CCK this weekend and see if I can come up with a custom widget that will bridge the gap between my protracted project and that.

.dan.

I love the promise of RDF

robertDouglass's picture

Dan, you know I've followed your work with RDF for a long time. I would love to have an engine that could express relationships as richly as RDF can. Thus I'm very supportive and enthusiastic about moving your stuff closer to the burgeoning world of CCK+Views. The thing that worries me is not the technology, its suitability, its feasibility etc., but rather the fact that nobody besides you understands what the hell your code does. I don't think it's because your code is hard to read... I've looked, and it is not. It seems that not many people understand the underlying technologies. I still don't and I've been trying (I even bought the O'Reilly RDF book... can't really recommend it).

One of the reasons Views caught on was that people already grasped the underlying goal (build queries and render the results). Same with CCK (it's the new flexinode). So how do we bring more developers into the loop with your work? Neither Views nor CCK would be half the great things they are if they had been the effort of one developer (though it took one or two visionary developers in each case to build the foundation and blaze the trail). People, how can we help Dan out?

Linking Relationship and CCK

KarenS's picture

Dan's last comment about trying to create a CCK widget for the relationship module (if it's possible) is IMHO a step in that direction. I can tell you that there is lots of interest in the nodereference module, so people understand that concept and want it. The Relationship module is kind of daunting because it is so rich (and not everyone needs all that capability), but nodereference is simple and easy to grasp. So maybe linking them together is the real key, if it's possible. Maybe offering a way to wrap around or extend the nodereference module would work -- i.e. I can use nodereference to create simple relationships. If I later decide I need more than that, the Relationship module can somehow use or extend or upgrade from the relationships I already created (so I don't have to lose all that work) and add in additional functionality.

This does not solve the

sun's picture

This does not solve the issue that nodereferences aren't real references. This patch is totally messed up by this condition.

As dman described here, I'd rather see the full-blown Relationship API provided as the basis for easy but at any time extensible nodereference fields in CCK. This would allow total flexibility/scaling and at the same time a better recognition/faster development of Relationship. Also no dupes. I don't think we would need any additional configuration fields elsewhere - just leave current node-/userreference field UIs as they are.

But what we really need is a stable subset of Relationship module allowing simple "Child Of" relations for nodereferences. These have to be somewhat "untouchable" while other parts (i.e. the real/major API) are developed actively in the background. Step by step additional functions are activated to the public audience (suggesting a Debug mode for development). Additional CCK fields may be created or nodereference field extended. dman also would have to clean up and document minimum parts of the current module now.

dman, what do you think about that?

Daniel F. Kudwien
unleashed mind


+1

tema's picture

very rational proposal!

Yes, +1. I'd like to see a

cooperaj's picture

Yes, +1. I'd like to see a greatly simplified Relations API in core and the ability to hook into that with CCK. I've been playing with Relationship.module and the power it offers is just amazing. It more than equals the power that the current taxonomy system can offer. But it's just so huge. What I'd like to see is the equivalent of the current ver 5.0 stripped down cck in core:

  • Two or three shipped predicates, with *no* others.
  • The ability to create a predicate. The form should have a single text field where I can enter 'is liked by' and a dropdown to choose the entity (Node, User etc)
  • An ajaxy interface very similar to the current upload.module.
  • and finally...
  • A backend full of beautiful hooks where we can add in all this foaf-rdf-schema-import-export malarkey.

Once this is in place the CCK noderelations field would be a doddle and would become incredibly powerful.

cooperaj, seems like you are

sun's picture

cooperaj, seems like you are more in-depth with Relationship module. Would you mind helping dman out?

I don't think that cloning/forking of Relationship module is a good idea. Instead we should expose those basic functions of Relationship module by working out a basic documentation of (just) those needed functions. In my last test of Relationship the module offered me to install a basic set of predicates (exactly for those child-of and parent-of relations). If we could just find a way to leverage these from CCK nodereference, the first step would be done. I.e. provide a simple hook for nodereference (and possibly others) to relationship for inserting, updating, deleting and retrieving child-of relations. CCK nodereferences are just those child-of relations currently.

Any other things can be kept in Relationship and there are no further modifications needed in short-term. Relationship would just act as a dependent module for nodereference, providing that basic subset of child-of relationships. Relationship module does nothing to your site and forms if you don't enable extended features.

Daniel F. Kudwien
unleashed mind


See what we can do.

dman's picture

The simplest relationship - relation - can come along with it. It's faded in and out of my schemas. I removed it at some point because it seemed singularly pointless, but now that I've implemented class inheritance in the definitions, it should come back.

Unfortunately, I've built a lot of the structure on assuming the structure is there. The very definitions of predicates are relationship statements themselves. The shipped predicates are legion (I know :-( ) but recently I've mostly hidden them from the interface.

Creating a predicate is what it's all about, and has been there since day 1. However to simplify it right down, I do need a wizard-like form to hide the rest of the guts.

The hooks (relationship_api.inc) are plentiful. Non-obvious to see how to use them all, I admit, but I've kept those bits - the ones intended for public access - there in the one file, made them paranoid about input assertions and really padded out the docs.

I had some progress with the CCK angle last weekend, BUT in that I don't see the relationships as part of the node data/storage themselves, the data can't very well live inside the CCK table. I came a big circle and refined the node-edit and content-definition screens again.
Although I could get a widget that worked fine with CCK nodes, it was just generic, and not at all CCK-specific.
The remaining bit is I guess to hook into the CCK definition process and hide the complexity of defining domains and ranges (which is what it's all about I guess)

Goals

sun's picture

Some thoughts about that:

  • You're right, it won't work to have relationship data inside the CCK table. The problems that result from this are the reason for integrating Relationship with CCK nodereferences.
  • Although it would be semantically correct, I'd say that it wouldn't be required to have RDF information in CCK nodereferences. That would be the cherry on top, but not necessarily needed.
  • I'd rather see CCK nodereferences as some kind of child-of/parent-of relations, but not necessarily relations between humans; they are more generic child-of/parent-of relations. I have to admit, I don't understand much of RDF yet. These predicates should come with CCK nodereference, and users shouldn't need to install or configure them to use CCK nodereferences.
  • A CCK nodereference currently is an AJAX widget that lets you select a node from a previously configured list of node types (defined separately for each nodereference field). We should concentrate on leaving this widget as is, but replacing the underlying nodereference functions with Relationship API functions. The resulting method should allow an upgrade path for people already using nodereferences (move nodereference data to relationship and drop nodereference columns from CCK node types).
  • That said it should be mentioned that CCK is on the way to display referring nodes (backlinks) in each node that is referenced by another node.

Daniel F. Kudwien
unleashed mind


Yo, I be back

dman's picture

I've now enabled my subscription to these groups - which I didn't have on before so I tended to miss bits...

I agree and admit that the project as it stands is a behemoth, and daunting code to start on.
A 'relationship lite' interface is probably a good idea, but that's the concept of an API in the first place. As it is, the admin side is already in its own files, the API+core as illustrated here can be called on without either the complicated admin UI or even the widgety (not actual AJAX yet) node edit UI being needed.
Um, bootstrapping however does now require the RDF library, as the structure of predicates (relationship types) is now defined/distributed in files like that

Relationship Lite (code-name One-Night-Stand)

Soo...
I'll boil up a minimal install, which will (have to be) totally compatible with the big version. However, as I've found with other APIs, it doesn't leave much left to look at. My big UI efforts have been trying to deliberately expose the guts.
However, fago recommends no UI at all... That's easy, but hard to illustrate to anyone who wants to play. There's a PILE of docs in the code. ... I'll see what publishing the API-doc does to it again, it produced a mess last time I tried.

what makes nodefamily different? no UI

fago's picture

Imho the best UI for node relations would be: no UI.

This is why I built the nodefamily module. It tries to determine automatically the relation between nodes, so there is no need for any user interaction to set the appropriate relation.
I know that this isn't possible every time, however in some cases it is. Currently nodefamily uses configurable relations between content-types to set relations between single nodes. It sets the relation for all nodes of the appropriate types that belong to the same author.

By restricting the nodefamily population of the content types (=max. number of nodes of this type per user) it's possible to define 1:1, 1:n, n:n,.. relations.
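For readers who want to see the population idea spelled out, here is a small Python sketch of it. It is not nodefamily's code; the function names and the node dictionaries are invented for illustration. It assumes only what is described above: nodes of two related types are linked when they share an author, and a per-user population limit on a type caps how many such nodes can exist.

```python
def may_create(nodes, author, node_type, max_population):
    """Return True if `author` may create another node of `node_type`.

    `max_population` is the hypothetical per-user limit; None means unlimited.
    """
    if max_population is None:
        return True
    count = sum(1 for n in nodes if n["author"] == author and n["type"] == node_type)
    return count < max_population


def family_links(nodes, parent_type, child_type):
    """Pair every parent-type node with every child-type node of the same author."""
    return [
        (p["nid"], c["nid"])
        for p in nodes
        for c in nodes
        if p["type"] == parent_type
        and c["type"] == child_type
        and p["author"] == c["author"]
    ]


nodes = [
    {"nid": 1, "type": "profile", "author": "ann"},
    {"nid": 2, "type": "album", "author": "ann"},
    {"nid": 3, "type": "album", "author": "ann"},
    {"nid": 4, "type": "profile", "author": "bob"},
]

# Limit "profile" to 1 per user and leave "album" unlimited: profile-album is 1:n.
links = family_links(nodes, "profile", "album")
ann_second_profile = may_create(nodes, "ann", "profile", max_population=1)
ann_third_album = may_create(nodes, "ann", "album", max_population=None)
```

Limiting both types to 1 per user would give 1:1, and limiting neither gives n:n within each author's set of nodes.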

I've already thought about using pageroute to determine which nodes belong together. Nodefamily could set the relation for all nodes which have appropriate types and are created during the same pageroute. However this would require some extensions to pageroute, too, because then it has to be possible to add and edit different nodefamilies by using the same pageroute. Currently it is restricted to editing 1 nodefamily / pageroute, or you use the route for node creation only.

In my opinion this would be the most user-friendly way. Create the nodes through the pageroute. Edit the same family of nodes through the route again.

Relationships and views

KarenS's picture

I agree that the node and user reference fields have limitations and may or may not be the best way to go, but one advantage they have is that they are fields that can be pulled up in views, which makes them very powerful. I use that ability extensively to, for instance, display on an event node a views view table of related nodes.

I haven't looked at the relationship module lately (the last time I looked it hadn't been ported to 4.7), so maybe it is integrated with views already, but I just wanted to mention that any solution should have views integration built in.

validation hook

moshe weitzman's picture

seems like the example you gave about a given piece being referenced only once is easily solved with a validation handler on the node form. node reference field type could offer this feature if it wanted to.
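As a language-neutral illustration of that kind of validation check (sketched in Python, not as a Drupal form handler), the logic could look like the following. The function name and the shape of `existing_refs` are assumptions made up for this example.

```python
def validate_unique_reference(existing_refs, referrer_id, submitted_targets):
    """Return error messages for targets already referenced by another node.

    `existing_refs` maps a referring node (e.g. a composer) to the set of
    nodes it currently references (e.g. pieces). This structure is hypothetical.
    """
    errors = []
    for target in submitted_targets:
        for other, targets in existing_refs.items():
            if other != referrer_id and target in targets:
                errors.append(f"{target!r} is already referenced by {other!r}")
    return errors


existing = {"Beethoven": {"Moonlight Sonata"}}

# Stravinsky tries to claim a piece Beethoven already references: one error.
errors_for_stravinsky = validate_unique_reference(
    existing, "Stravinsky", ["Moonlight Sonata"]
)

# Re-saving Beethoven's own node with the same piece is fine.
errors_for_beethoven = validate_unique_reference(
    existing, "Beethoven", ["Moonlight Sonata"]
)
```

A form would refuse to save whenever the returned list is non-empty. This enforces the 1:n case only at submit time; it does nothing about data that is already inconsistent.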

What we've done right

robertDouglass's picture

Ok, here I go; my favorite aspects of the various modules that currently exist.

Relativity

One thing that the Relativity module does really well is let you configure the ordinality between nodes as well as how the interface should work. Yes it could be made better, but its author solved all the underlying problems for an initial working module, and that is worth some studying. It is straightforward enough to say that node type X can have many node type Y as children, but that node type Y can only have 1 parent X.

Relativity also solves the interface problem by adding a link to create a new related node or choose from a list of related nodes.

Relativity initially tried to solve the Views issue (it predates views) by having a relativity query node type. On June 20, darius added Views support.

Its biggest weakness is that it only provides a way to link between nodes. There is no meaning to the relationship. It all boils down to a left-right duple.

Node Relationships

Dan's RDF module. It looks like it could do most of what we want and then some. It looks like it is suitable to leverage the wonderfully rich ontologies that already exist. It looks like cardinality and semantic meaning in relationships could become the norm.

Dan has provided tons of documentation:

Shortcomings

  • Not based on or integrated with CCK
  • No Views integration
  • You really have to understand the underlying XML technologies to effectively use the module
  • Dan lives in a remote corner of the Earth and doesn't give presentations at Drupalcons
  • historically unstable
  • Dan's the only one who knows what it does, where it could go, and how valuable it could be

CCK user and node reference fields

The good:

  • Quasi-semantic (the field has a title which conveys semantic meaning... better than Relativity in that sense)
  • Works right now
  • gains value because of being part of CCK
  • Views integration

The missing:

  • No real semantic meaning (in comparison to Dan's Node relationships)
  • No cardinality constraints or bidirectionality

The ugly:

  • Interface issues

I wish I could present the

dman's picture

I wish I could present the stuff better. I keep starting a little tutorial or something, then get a brainstorm or discover a small bug and go off and fix it for an hour or two.
As such, I've never been able to properly communicate what this is supposed to (and sometimes even can) achieve.

I had a go at CCK and views this afternoon. CCK looks pretty easy to work with, but views is a bitch for me to get at from my direction. It relies on all data being stored in direct lookup tables, mostly indexed on nid.
My efforts towards a metadata query language have been following the W3C SPARQL and similar. This syntax does not have a 1:1 relationship with SQL, and I can't see how I can twist the cryptic views hook to respond to my info.

So, although I can currently say
"give me all the relationships that node 12 has" or
"give me all the child relationships that node 12 has"
- that data returned may have been deduced by a logical chain of lookups :

  • node 12 is also known as 'Adam'
  • node 27 has a Parent called 'Adam'
  • 'Parent' is the inverse of the relationship 'child'
  • => thus, one of the child relationships of node 12 is node 27.

This works, and is what I wanted to happen, but feeding that data back to 'views' (which expects to be talking native SQL) is looking pretty hard.
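To show why this doesn't map onto a single table lookup, here is the chain of deductions above as a toy Python sketch. It is not the Relationship module's code; the `related` function and the three data structures are invented for illustration and cover only aliases and inverse predicates.

```python
def related(node, predicate, statements, aliases, inverses):
    """Find nodes linked to `node` by `predicate`, directly or via its inverse.

    statements: (subject, predicate, object) triples
    aliases:    name -> node id (e.g. 'Adam' -> 12)
    inverses:   predicate -> inverse predicate
    """
    # Every identifier this node is known by: its id plus any aliases.
    names = {node} | {name for name, nid in aliases.items() if nid == node}
    found = set()
    for subj, pred, obj in statements:
        if pred == predicate and subj in names:
            # Stated directly: node --predicate--> obj
            found.add(aliases.get(obj, obj))
        elif inverses.get(pred) == predicate and obj in names:
            # Stated the other way round: subj --inverse--> node
            found.add(aliases.get(subj, subj))
    return found


aliases = {"Adam": 12}                        # node 12 is also known as 'Adam'
statements = [(27, "parent", "Adam")]         # node 27 has a parent called 'Adam'
inverses = {"parent": "child", "child": "parent"}

children_of_12 = related(12, "child", statements, aliases, inverses)
```

No row anywhere says "12 child 27"; the answer only appears after resolving the alias and flipping the predicate, which is exactly the part a plain SQL join on nid has no way to express.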

CCK, OTOH, is simple, all the data gets loaded into the node on node_load.
I'll be able to publish the configuration to the CCK interface pretty easily by the looks of things. I've already hooked into the CCK node type configuration form before now.

Yes, although I've worked on it lots, it's usually unstable. And big. And damn hard to see the need for what looks like all the complexity in there when all you want is to establish a simple 'relationship' link.
It's due to my attempts to live up to the groundwork done by the DC, W3C and the rest in providing these long-winded definitions of semantics, and at the same time leave it all extendable so that other criteria (like this cardinality thing) can be added naturally.

I'd like to hide most of the magic and come back to an interface that's as simple as adding a CCK field. Adding a new widget seems the easiest way to do that. I can certainly do at least as good (bad?) as the current nodereference ui. All I would do is add more options ... and thus complexity :-/

.dan.

views fusion

fago's picture

I've already worked on views support for the nodefamily module. For this I've created the module Views Fusion.

With views you could do simple filtering based on nids like this: Give me the parents of nid 4 - if you have appropriate tables. However as soon as you need to join the node table twice you can't do it with a usual view, e.g. this wouldn't work:
"Give me the titles of all nodes which are parents of the node with the title 'Adam'"

This is where views fusion comes in. However you have to create a view for each kind of nodes, then you "fuse" the views by telling the views fusion module which node relation it should use for fusing.

So for the above example you would create one view, which lists the node titles. Then you create another view which filters for nodes with the title "Adam". Then you would use views_fusion to fuse them.
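Roughly, the fusing step amounts to a join of two result sets over a relation. The Python sketch below only illustrates that idea; it is not how Views Fusion is implemented, and `fuse`, the row dictionaries and the `relation` pairs are made-up names.

```python
def fuse(left_view, right_view, relation):
    """Keep rows of `left_view` whose node is the parent of a row in `right_view`.

    relation: iterable of (parent_nid, child_nid) pairs.
    """
    right_ids = {row["nid"] for row in right_view}
    parent_ids = {parent for parent, child in relation if child in right_ids}
    return [row for row in left_view if row["nid"] in parent_ids]


all_nodes = [
    {"nid": 1, "title": "Eve"},
    {"nid": 2, "title": "Adam"},
    {"nid": 3, "title": "Seth"},
]
relation = [(1, 2), (2, 3)]  # 1 is parent of 2, 2 is parent of 3

# View 1: list node titles. View 2: nodes filtered on title "Adam".
titles_view = all_nodes
adam_view = [row for row in all_nodes if row["title"] == "Adam"]

parents_of_adam = [row["title"] for row in fuse(titles_view, adam_view, relation)]
```

Each view on its own only ever touches the node table once; the second pass over it happens in the fuse step.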

patch advertising :-)

yched's picture

As for cck's nodereference interface issues, I'd be a fool not to mention my own effort over there : http://drupal.org/node/78825

Still needs a little cosmetic love IMO, but the idea is here and working.

(edit : I just saw Karen commented about it in another related thread :-) )

I'm thrilled to see this

robertDouglass's picture

Thanks!

Constraints

robertDouglass's picture

In addition to ordinality, the other feature I miss in the user and node reference fields is the ability to provide constraints. For the sake of a trumped up example, let's say I had a "person" node and I wanted it to have "brothers" and "sisters" node reference fields. Both fields would reference other "person" nodes, but you clearly wouldn't want the "brothers" field to be able to reference female persons. So there needs to be a way to constrain the set of potentially referenced nodes. Would it be possible to make a Views based constraint for the reference fields? You'd set up a View (all person nodes with gender = female), and then configure the "sisters" field to only offer nodes returned by that view as potential link candidates.
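Here is the gist of that idea as a small Python sketch, with a "view" reduced to a filter over node dictionaries. The helper names `run_view` and `invalid_references` and the node fields are assumptions for the sake of the example, not anything Views or CCK provides.

```python
def run_view(nodes, **filters):
    """Stand-in for a View: return nodes whose fields match every filter."""
    return [n for n in nodes if all(n.get(k) == v for k, v in filters.items())]


def invalid_references(candidates, submitted_nids):
    """Return the submitted nids that are not among the view's candidates."""
    allowed = {n["nid"] for n in candidates}
    return [nid for nid in submitted_nids if nid not in allowed]


people = [
    {"nid": 1, "type": "person", "name": "Anna", "gender": "female"},
    {"nid": 2, "type": "person", "name": "Ben", "gender": "male"},
    {"nid": 3, "type": "person", "name": "Cleo", "gender": "female"},
]

# The "sisters" field only offers nodes returned by this view.
sister_candidates = run_view(people, type="person", gender="female")
sister_names = [n["name"] for n in sister_candidates]

# Submitting Ben (nid 2) as a sister would be rejected.
rejected_nids = invalid_references(sister_candidates, [1, 2])
```

The same candidate list serves twice: once to populate the widget's choices, and once at validation time so a hand-crafted submission can't sneak past the constraint.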

Ok, this is scope creep for the thread here, but it is something we definitely want to discuss.

Using views to select nodes

KarenS's picture

Have you seen http://drupal.org/node/78825? Yched has created a way to use a views view as a selector for the nodereference field.

that's the patch

drewish's picture

robert, that was the patch i was referring to in our conversation. i think by combining that with a view that pulls an argument from the URL we could hack something together.

importexportapi helps a bit

Jaza's picture

I've noticed this thread hanging around for a while, but I haven't looked at it before now. Anyway, I thought I'd chime in with a bit about the new Import / Export API module that I've written (for the Summer of Code '06), and what it has to offer in the relationships / references / ordinality sphere, and how it compares to some of the other modules that have been discussed in this thread.

The API is based on a data definition system, which consists of nested fields that generally map fairly directly to Drupal's database schema. There are three main types of fields: an 'entity' is a top-level field for 'objects' in Drupal (e.g. 'node', 'user', 'term', etc); an 'array' is a field within an entity, or within another array, that holds a set of values that are the children of their parent entity or array; and all other fields (let's call them 'regular' fields) are flat values, e.g. 'int', 'string', and are children of either an 'entity' or an 'array'.

'regular' fields can reference any other 'regular' fields. These references are implicitly either 1-1, 1-M, or M-N, based on the positions of the respective referencee and referencer fields in the definition structure. For example, 'role.rid' has an implicit 1-1 reference with 'role.perm_rid', 'node.revision.nid' has an implicit 1-M reference with 'node.nid', and 'node.taxonomy.tid' has an implicit M-N reference with 'term.tid'. The API's bundled engines work with these implicit relationships when doing imports or exports. However, I don't know how hard it would be to actually parse the definitions, and to return a clear list of relationships and ordinalities - the referencing system is probably not ideally suited for this.

I think it would be fair to say that this system is quite 'crude' compared to other solutions, such as Dan's relationships API. It's based on allowing for automated import and export (i.e. automated query generation and automated text file manipulation) by representing the DB's schema as accurately as possible in code. It's not based on any existing standards or on academic theory (e.g. RDF, ontologies), it just sort of 'came together' to achieve the purpose that it's built for. And, of course, it doesn't allow for any semantic meaning in relationships, such as the predicate system in the relationships API (e.g. 'parentOf', 'relatedTo'). It just allows one field to reference another field, and it allows for an implicit ordinality between those two fields, based on their relative positions in the definition structure.

Anyway, just some food for thought, and just one more solution that's out there and that's doing its best to solve the ever-present relationships problem in Drupal.

Jeremy Epstein - GreenAsh


RDF integration with CCK

hendler's picture

I'm also working on a project which is integrating CCK into an RDF store called NINA. NINA is just the name of the sponsor for the code. But really it's a general purpose tool. I have been reluctant to discuss too much in the Drupal community until I contribute to CVS.

Anyway, Node Reference and providing the constraints Robert is talking about is getting to the core of what is juicy about RDF. Node relationships overlap with taxonomy categories to an extent, and so I haven't decided whether to use them for much. But when you want to describe arbitrary "relationships" between nodes there is still a lot of potential there. Views may not take advantage of these arbitrary relationships well, but that's why I want RDF inside of Drupal.

One workaround is to create multiple types of node relations that fit, if not enforce, your data model.
If there is a hierarchy, you may not be able to enforce it, but you can have one node relation field in CCK that is "dad" and another which is "granddad" - make sense?

Some Background:
I may have met some of you at drupalcon.
Both Dan (above) and I work with ARC - a nice PHP RDF store. Ben Nowack, ARC's creator, knows both Dan and me through our projects. I had a few emails with Dan back in February. I come from the RDF world (some academia) and have been learning Drupal in depth only for 4 months or so.

Content Construction Kit (CCK)
