Advertisement
If you have a new account but are having problems posting or verifying your account, please email us on hello@boards.ie for help. Thanks :)
Hello all! Please ensure that you are posting a new thread or question in the appropriate forum. The Feedback forum is overwhelmed with questions that are having to be moved elsewhere. If you need help to verify your account contact hello@boards.ie
Hi there,
There is an issue with role permissions that is being worked on at the moment.
If you are having trouble with access or permissions on regional forums please post here to get access: https://www.boards.ie/discussion/2058365403/you-do-not-have-permission-for-that#latest

Data normalisation / database query

  • 13-06-2012 5:53pm
    #1
    Registered Users, Registered Users 2 Posts: 200 ✭✭


    Hi,

    this is not my area at all, but I have a small project that I wish to complete from start to finish and one part is to store the information in a database. Am trying to get my head around data redundancy and normalisation.

    If we use a student database as an example where every student is assigned a teacher (Mr. Joe Bloggs, Miss Jane Doe, etc.). What is the correct and proper way to store the teacher information in terms of reducing redundant data. Should Title (Mr, Miss, Mrs, Dr., Prof, etc.) i.e. something with only a few options, be in it's own table or is there a more common way of doing this?

    Expanding on that further, suppose each student is assigned a teacher and a mentor (some teachers could also be mentors and vice versa, some may only mentor, others may only teach). What is the "best-practice" way of storing this information (one big staff table, teacher and mentor tables, title table required that mentor and teacher table(s) are linked to)?

    The more I try and figure this out the more problems arise and confused I get. Any advice or pointers welcomed. It is hard to get an "in" on this stuff.


Comments

  • Moderators, Science, Health & Environment Moderators Posts: 9,053 Mod ✭✭✭✭mewso


    This is a huge area and some of it will be down to personal preference. On the question of title a normalisation purist might insist on a separate table but I personally wouldn't bother with a small rarely changing list of options. An alternative would be to determine title in other ways like having Gender, MaritalStatus, HasDoctorate and so on as columns that would allow your UI to display the relevant title while also storing information that would be of use in other areas.

    Your second question is again something I would do a certain way that others might not but in essence I like to bring the idea of inheritance into these scenarios (I most often develop objects in code well before deciding how to store them) and we have some obvious common factors between students, teachers and mentors. They are all people so I would have a people table containing the basic info. like name, age and so on. Then Student, Teacher and Mentor tables will have one-to-one links to this table with the information specific to each contained in their own table. Then you'll have the StudentsTeachers and StudentsMentors tables to link students up to teachers and mentors.


  • Registered Users, Registered Users 2 Posts: 586 ✭✭✭Aswerty


    Mewso has given a solid answer.

    I might just add that linking the students with mentors and teachers is done in the StudentsTeachers and StudentsMentors tables by having a column of the student primary keys and the Teacher/Mentor primary keys. Joins can then be used to link the tables together when querying the database.

    Also with regards to the question about the titles as Mewso said it's not that important to push small pieces of data like this into their own table but I think when you have the option to do it you should since I'd expect there to be small performance gains.


  • Registered Users, Registered Users 2 Posts: 2,781 ✭✭✭amen


    This is a huge area and some of it will be down to personal preference

    true
    On the question of title a normalisation purist might insist on a separate table but I personally wouldn't bother with a small rarely changing list of options

    Disagree. You should really have a table called refTitles which contains a unique ID and a Title column so you would have
    [PHP]
    Table refTitles
    ID Title
    1 Mr
    2 Miss
    3 Mrs
    4 Dr
    [/PHP]

    In your Teacher table you have a column called TitleID which refers to the ID in the refTitles. This way if you need to change a spelling, add a new title, support localisation its very easy to change.
    An alternative would be to determine title in other ways like having Gender, MaritalStatus, HasDoctorate and so on as columns that would allow your UI to display the relevant title while also storing information that would be of use in other areas.
    Using other columns to determine a Title is a bit of a nightmare. You shouldn't have to use logic to get a simple title. What happens if someone has a PhD (HasDoctorare is selected) but they want to be know as Mr? What happens if someone is married and goes by their maiden name and you start addressing them as Mrs Maiden name ? Actually using Marital status to determine Title is tricky what would you do for a male ? Do a Sex lookup as well as Maritial status to determine Title?


  • Moderators, Technology & Internet Moderators Posts: 1,336 Mod ✭✭✭✭croo


    The way I have usually seem these lists implemented (in databases) is with them grouped into a separated table(s).

    So with might have a table called Enumeration that contains a record for Mr, Mrs etc...
    But we also have a table EnumerationType and we can tie these two together so that we have

    EnumerationType
    id, descr
    1, Salutation
    2, Role
    3, Gender

    while the Enumeration table would have
    id, enumTypeId, enum
    1,1,Mr
    2,1,Mrs
    3,2,Teacher
    4,2,Mentor
    5,2,Student
    6,3,Male
    7,3,Female

    Then ALL the various enumerations in the application would be stored here and so, in theory, modifiable by the users without need to change code.

    It might seem like there would be a performance overhead but the reality is this table is so frequently used that the data is usually in the cache so the real overhead is minimal.

    There are more complex possible structures this might take but this should suffice for simple needs.


  • Moderators, Science, Health & Environment Moderators Posts: 9,053 Mod ✭✭✭✭mewso


    amen is correct to point out the problems with my suggestion re. gender etc. but in my usual badly worded style I am simply trying to encourage a thought process that seeks out the best solution for a specific scenario. I was not saying that using gender, maritalstatus and so on was the best way to go about it just thinking out loud.

    I'm not a big fan of absolutes hence why I would not lay down a golden rule to use a foreign key relationship for the likes of a title but it's absolutely a perfectly valid solution and I take the point of localisation if it needs to be a consideration.


  • Advertisement
  • Registered Users, Registered Users 2 Posts: 200 ✭✭druidhill


    Thanks for all the replies so far - it is good to get feedback from people who are in this area. I'm still trying to absorb the points that everyone has made, so I'm sure I'll have follow on questions.

    Thanks for a good discussion.


  • Registered Users, Registered Users 2 Posts: 2,781 ✭✭✭amen


    Discussions are always good. It's how we learn


  • Closed Accounts Posts: 18,056 ✭✭✭✭BostonB


    There are also reasons not to normalise.

    http://databases.about.com/od/specificproducts/a/Should-I-Normalize-My-Database.htm

    Too often do I see databases normalised too far, where its far more complicated than it should be, usually making it much harder to develop against, support and can have a negative impact on speed. Normalisation should be helpful. It should also mirror real life where possible. I like the idea of planning the application and objects, and use cases first. Then designing the database to suit. I often see a databases normalised in a way that isn't helpful.

    However I suspect this is a college project. As such they may not be looking for practical solutions, but the application of the theory.

    In the example of a student database, I think theres some logical (conceptual) relationships missing here. There would be the concept of a teacherID and student ID. ClassID, and perhaps termID, yearID, moduleID. You might have a staff table that has details like are they active, or not, what subjects they do, what level they teach at, what level of access they have in the application, are they admin or not for example. But wouldn't include their name or address'es etc, or email, that would be in staff details table.


  • Registered Users, Registered Users 2 Posts: 200 ✭✭druidhill


    Your suspicions are incorrect BostonB - it's not a college project, it's a work thing that I would like to take the paper work element out of. It's something that I think would be good for me to develop an application from start to finish.

    There will however as you mentioned, be an element of certain users having admin access to the application, as well as a "superadministrator" role.

    It is interesting to see that even with only a few replies, there is a lot of variation in how people would choose to handle this.


  • Registered Users, Registered Users 2 Posts: 5,246 ✭✭✭conor.hogan.2


    druidhill wrote: »
    It is interesting to see that even with only a few replies, there is a lot of variation in how people would choose to handle this.

    DB work is full time jobs for lots of people, opinions in this field are endless nearly even in RDBMS alone.


  • Advertisement
  • Closed Accounts Posts: 19,777 ✭✭✭✭The Corinthian


    druidhill wrote: »
    Expanding on that further, suppose each student is assigned a teacher and a mentor (some teachers could also be mentors and vice versa, some may only mentor, others may only teach). What is the "best-practice" way of storing this information (one big staff table, teacher and mentor tables, title table required that mentor and teacher table(s) are linked to)?
    One approach would be to simply have a single teacher/mentor table that includes two boolean fields for 'teaches' and 'mentors', which would allow you to easily record those who are both. Of course, that does open up to the possiblity of someone who does neither, which may actually be something you don't want to have - or do.

    Another approach is a lookup table with all the possibilities that you will allow are catered to. For example:
    1 Teacher who also Mentors
    2 Mentor who also Teaches
    3 Mentor
    4 Teacher
    
    Of course, a question I could ask is that is teacher/mentor imperially identified as such or these simply roles? By this, if someone who does both are they principally a mentor or teacher, who just happen to also do the second job as well.

    If so, the second approach would work better, as would a third one where the teacher/mentor table would include primary and secondary role fields, linked to a simple lookup table such as:
    1 Teacher
    2 Mentor
    
    In this case, where a teacher/mentor only fulfill the primary role, then the secondary role field would remain zero.

    A fourth approach is to use a many-to-many relationship; roles table (as above), teacher/mentor table with no additional fields and a lookup table between them. This last table would have two principle fields; for the ID of the teacher/mentor and the ID of the role. This means that multiple roles can be added to a teacher/mentor, each with its own lookup table record.

    All in all, I can see both advantages and disadvantages in all of the above, but almost always related to future proofing. That is where requirements may change in the future, such as a new role being introduced, such as an administrator, and that a teacher/mentor/administrator may be able to fulfill up to three roles at a time.

    But as this is just a project, you should not have this problem. Indeed, if the purpose is to show your understanding of relational databases, then I'd be tempted to go for the fourth approach - although it probably would be an overkill otherwise.


  • Registered Users, Registered Users 2 Posts: 9,568 ✭✭✭DublinWriter


    Cleanest way I could see would be a students table and a teachers table.

    The student table would contain two foreign keys to the teachers table for assigned teacher and assigned mentor.

    It's normally best to put things like titles into a table as leaving it freetext in the teachers table could lead to weird and wonderful spellings, making it potentially difficult in the future if someone wants a list of all staff who are PhDs.


Advertisement