Tutorial :nested insert in mysql for tagging


i'd like to add a tag to a blogpost with a single sql statement.

say my tables would look as follows:

tags  +-------+-----------+  | tagid | tag       |  +-------+-----------+  |     1 | news      |   |     2 | top-story |   +-------+-----------+    tag2post  +----+--------+-------+  | id | postid | tagid |       +----+--------+-------+  |  0 |    322 |     1 |  +----+--------+-------+  

the problem i'd like to solve is inserting a new tag, retrieve it's id and then inset this new id into the relation table in a single sql statement.

INSERT INTO tag2post (postid, tagid)  VALUES  (      332, # the post      IF (          (SELECT tagid FROM tags WHERE tag = 'new_tag'),          (SELECT tagid FROM tags WHERE tag = 'new_tag'),           # here is where i'd like to insert            # the new_tag and return it's id          'i am lost here'      )  )  


You can't do this as a single insert because inserts are atomic--that is, the ID isn't determined until the statement completes.

Wrap both statements in a transaction and you will get your ID, and atomicity.


Instead of using an auto-increment ID column, use a GUID column. Then you can generate the GUID before you run the statement and do everything at once.

Even Jeff Atwood likes this approach, and you don't get a significant speed penalty for using 32-char strings instead of integers.

Also, to prevent tag duplication you should use the MD5 sum of the tag name as the tag ID.


I think you'd have to break it up. One way to do this is to add the UNIQUE constraint to tag and proceed as follows:

INSERT IGNORE INTO tags (tag) VALUE ('new_tag')  

and then

INSERT INTO tag2post (postid, tagid)  VALUES  (      332, # the post      (SELECT tagid FROM tags WHERE tag = 'new_tag')  )  

INSERT IGNORE may not scale well, however. So an alternative is to do SELECT, if the id is not there, INSERT, then be sure to catch an exception here if another thread/process/service modified the table between your SELECT and now. In the INSERT you do LAST_INSERT_ID() and int he catch you repeat the initial SELECT..

Not awesome, but if concurrency is a requirement, you need to do this. Transactions will not alleviate the issue and if you have no UNIQUE constraint on tag, you may end up with duplicate INSERTs.


You could try this, effectively doing an insert if it couldn't select it, and then retrieving the LAST_INSERT_ID() from the inserted row, but I seriously doubt it will work.

INSERT INTO tag2post (postid, tagid)  VALUES  (      332, # the post      IF (          (SELECT tagid FROM tags WHERE tag = 'new_tag'),          (SELECT tagid FROM tags WHERE tag = 'new_tag'),          IF (              (INSERT INTO tags (tag) VALUES ('new_tag')),              LAST_INSERT_ID(),              LAST_INSERT_ID()          )      )  )  


If you use a non-auto increment field and do your own increment calculation, you can determine what the id will be before you insert the field, and specify the new ids manually.

Then you can build a string of statements for each tag2post entry using these ids seperated by ;s and you can send it to msyql to execute.

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Next Post »