How many candidates request that golden interview hour of 11am? It’s not so early that anyone feels sleepy or hungry, and not so late in the day that the fatigue of work has set in. How many applicants vie for a Tuesday or Wednesday slot? According to Glassdoor, the well-known job review site, Tuesday is the optimal day for an interview.
Whether recruiters confess it or not, certain times and days are simply better than others.
Hunger, sleep, work stress, and time of day affect the attention span of an interviewer.
According to research done by Monster Job Boards, interviewers take just 385 seconds to decide if the candidate is right for the role. Monster.co.uk talked to 273 managers. Participants rated the importance of things like quality of small talk (60 percent agreed), strength of handshake (55 percent), and ability to hold eye contact (82 percent). While these are important, a recruiter has only a limited amount of time to reach a correct conclusion about a candidate.
We’ve worked hard to ensure results are reliable and our processes and parameters are clear. Retorio’s measures are based on validated scientific consensus.
What is reliability and why is it important?
Reliability describes the extent to which the results can be reproduced when the research is repeated under the same conditions. It’s verified by checking the consistency of results across time, across different users or observers, and across parts of the test itself.
Think of testing for reliability as akin to baking a cake:
When a recipe calls for certain ingredients, a certain temperature, and a certain process, a delicious cake should be the result every time. Save adjusting for altitude or a quirky oven, the cake can be easily “reproduced”. Quite a few AI video interview tools offer personality assessments. However, companies that do not offer high reliability, or transparency about what their data points are searching for, may have missed the mark on ensuring objectivity.
The basis for reliability is objectivity, which is defined by 3 factors.
The first factor is who conducts the assessment. Historically, this has been a person: a teacher or a professional proctor. Then computers took over that role in everything from driver’s license tests to learning disability assessments. It’s key that the proctor is independent and follows a standardized approach and process. Remember taking national exams as a kid, with your teacher reading out a set of instructions? It’s about standardizing as many variables as possible to produce an unbiased outcome. With Retorio, candidates simply log in to an account with a straightforward layout and standard questions created by hiring managers. It’s a marriage of standardization and personalization in this fast-moving world of work.
The second part of creating objectivity is ensuring the evaluation criteria are the same for every candidate. People should be judged by the same metrics. Job testing should be equal for all applicants, for example. The assessment is standardized, investigating the same variables.
The third pillar of objectivity is that results are interpreted independently. It’s arguably the most important. Humans are often subject to influences, whether obvious or inadvertent, which may affect results. Lack of sleep, hunger, wandering attention, or an ulterior motive may influence how a result is interpreted. Creating a mechanism to ensure objective analysis and interpretation of an assessment is key. Computers grade a range of assessments, from driving tests and student essays to legal papers. Depending on the program, this can be a more objective methodology than the standard human reviewer.
A further factor affecting the measurement is validity: whether the assessment is actually measuring the personality trait and not the current mood. “How are you feeling?” is not an appropriate question, as it’s subjective.
Objectivity precedes reliability.
According to quantum physics, the mere act of observation can completely change the outcome of an event. One of the most famous experiments in physics is the double slit experiment. It demonstrates, with eerie strangeness, that the very act of observing a particle has a dramatic effect on its behaviour. When electrons are observed, they act like particles; when they are not observed, they act like waves. Assessments are similar: unless an assessment and its process are objective, they will fall under the influence of bias. That’s why having an independent proctor is vital to conducting something as personal as a personality assessment.
A computer program standardizes the process, not only the interpretation. To avoid bias, personality assessments should be conducted, evaluated, and interpreted by an unbiased tool, like a computer program.
Retorio’s own objective design provides a strong base for reliability.
The variables being measured must be stable in themselves, otherwise results will be useless. Personality versus mood is one example. Overall, personality is rather stable. It may change over a lifetime, but basic patterns of logic, feeling, and decision-making generally stay the same. What changes often and quickly is mood. A person may feel excited in the morning and, a few hours later, tired and in need of withdrawing. Mood also depends on rapidly changing variables like hours of sleep, hunger, or simply a bad day. Retorio’s AI measures the stable dimensions of personality, which don’t tend to change often or rapidly.
At Retorio, we’ve transferred the standard Big 5 data points psychologists use to a digital format. Our AI-enabled personality test draws on thousands of these data points. This is roughly similar to how traditional bookkeeping was digitised into Excel, or how taxes can now be filed through online software without a tax consultant.
Measuring the reliability variable
The Reliability Coefficient is the number quantifying the degree of consistency. With a number, we can examine whether an outcome is consistent. A low figure may be due to a poor testing environment, a small number of participants included, or test design errors. For AI video interviews, calculating the reliability coefficient helps us gauge potential errors in testing.
Types of Reliability
A meta-analysis was conducted to measure the Big Five’s retest reliability. The study drew on 682 test–retest correlations collected within an interval of up to two months from 74 samples (total N = 14,923) across different measures of the Big Five. The median aggregated dependability estimate for the five traits was ρtt = .816. Extraversion scales resulted in the most dependable scores, whereas agreeableness scales exhibited slightly larger measurement error. Meta-regression analyses indicated small moderation effects of the chosen retest interval for three traits, with shorter intervals resulting in higher retest correlations. An assessment tool that reliably gauges personality can be used to assess an applicant’s personality traits and skills, and to improve employee selection.
Internal vs. External Reliability
Internal reliability describes consistency within the test itself, or, in cake terms, how well the different steps are aligned to guide you to the desired cake. External reliability measures whether a test can be generalized beyond the setting it’s being used in. For example, if a student is looking to improve their grade, individual tutoring is a method that’s applicable to both mathematics and geography, even when it’s conducted by a different tutor or done in a different setting. A test diagnosing anxiety should be able to detect symptoms of anxiety across varying age groups, socio-economic statuses, and personality types. The higher the score on both internal and external reliability, the more dependable the assessment.
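To make internal reliability a little more concrete, here is a minimal sketch of one commonly used internal-consistency statistic, Cronbach’s alpha. This article does not say which statistic Retorio uses, so treat the choice of alpha, the function name `cronbach_alpha`, and the ratings below as illustrative assumptions only.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a table of ratings.

    `responses` is a list of rows, one per respondent; each row holds that
    respondent's rating on every item of the same scale.
    """
    n_items = len(responses[0])
    if len(responses) < 2 or n_items < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must rate every item")

    # Sample variance of each item (column) and of each respondent's total.
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 1-5 ratings: five respondents, three items of one trait scale.
ratings = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Values closer to 1 suggest the items move together, which is the sense in which the “steps of the recipe” are aligned.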
Test-retest reliability is typically estimated using the intraclass correlation coefficient (ICC). In statistics, this descriptive statistic is used when quantitative measurements are made on units that are organized into groups, like the 5 dimensions of personality. The ICC quantifies how strongly units in the same group resemble each other, which makes it particularly appropriate for test-retest reliability.
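As a rough illustration of the ICC, the sketch below computes the one-way random-effects form, often written ICC(1,1), where each person is a “group” and their repeated scores are the units within it. The ICC comes in several forms and this article does not say which one Retorio uses, so the choice of form, the function name `icc_one_way`, and the scores are assumptions for illustration.

```python
def icc_one_way(scores):
    """One-way random-effects ICC(1,1).

    `scores` is a list of rows, one per person; each row holds that person's
    score from every testing session (for example, test and retest).
    """
    n_people = len(scores)
    n_sessions = len(scores[0])
    if n_people < 2 or n_sessions < 2:
        raise ValueError("need at least 2 people and 2 sessions")
    if any(len(row) != n_sessions for row in scores):
        raise ValueError("every person needs a score for every session")

    grand_mean = sum(sum(row) for row in scores) / (n_people * n_sessions)
    person_means = [sum(row) / n_sessions for row in scores]

    # Between-person and within-person sums of squares (one-way ANOVA).
    ss_between = n_sessions * sum((m - grand_mean) ** 2 for m in person_means)
    ss_within = sum(
        (value - mean) ** 2
        for row, mean in zip(scores, person_means)
        for value in row
    )
    ms_between = ss_between / (n_people - 1)
    ms_within = ss_within / (n_people * (n_sessions - 1))

    denominator = ms_between + (n_sessions - 1) * ms_within
    if denominator == 0:
        raise ValueError("scores do not vary at all")
    return (ms_between - ms_within) / denominator


# Hypothetical extraversion scores for five people: [first test, retest].
extraversion = [
    [3.8, 3.9],
    [2.1, 2.4],
    [4.5, 4.4],
    [3.0, 3.2],
    [1.9, 1.8],
]
icc = icc_one_way(extraversion)
print(round(icc, 3))
```

When each person’s two scores sit close together relative to the spread between people, the ICC approaches 1.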
Consider this particular example: a group of kindergartners is given a vocabulary test on August 26th and then retested on September 5th. Assuming the students’ abilities change little in between, both tests should yield similar results. To find the test-retest reliability coefficient, we calculate the correlation between the test scores and the retest scores.
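The kindergarten example can be sketched in a few lines. This is a minimal illustration that uses the Pearson correlation as the test-retest coefficient; the scores are made up, and `pearson_r` is simply a helper name chosen here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical vocabulary scores for six children, listed in the same order.
august_26 = [12, 15, 9, 20, 17, 11]
september_5 = [13, 14, 10, 19, 18, 11]

test_retest_r = pearson_r(august_26, september_5)
print(round(test_retest_r, 3))
```

A coefficient near 1 means children who scored high the first time also scored high the second time; a coefficient near 0 means the two sittings tell unrelated stories.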
The test-retest reliability is one of the most important indicators in an AI video interview. This is where HireVue's system needed improvement. Results were not consistent and a strong correlation could not be found between testing dates.
We maintain high test-retest reliability; we’ve analyzed thousands of hours of video material, teaching our AI what to look for in each dimension of the Big 5. We’ve tested and retested results from over 1000 participants, achieving 90% consistency in results. Any new evaluation results are double-checked externally and adjusted. With Retorio’s system, employee assessment results are saved, making them easily accessible for both the employee and the human resources manager on a secured platform. This provides companies with a special opportunity.
What does this mean for the world of work?
Hire for job and culture fit
Psychological testing, also called psychological assessment, is a common tool psychologists use to better understand a person and their behavior. These "aptitude tests" are helpful and widely used by industry leaders for understanding potential and current employees. Because they help determine the core components of a person’s psychological or mental patterns and values, they are useful for coordinating a team’s strengths.
Improve recruitment time and cost efficiency
An AI-powered personality assessment creates a more personalized understanding of applicants and employees. When hiring, a recruiter may better understand a person’s core values and culture fit, which adds a competitive edge in diagnosing employee-organizational fit. The average cost per hire hovers around $4000 per person; it pays dividends to pay extra attention to each person being evaluated.
Increase team performance
For current employees, if an organization regularly schedules Big 5 assessments, the process can become a growth-oriented conversation. Retorio’s Big 5 assessment is applicable not only during hiring and recruiting but also for self-development, and it’s already being used in team building at companies like Personio. It can easily be integrated into an employee’s development plan. If an employee wants to improve their ability to lead a team or organize a project, the Big 5 can offer feedback on which specific areas to improve. This plan of action can be created on an employee’s own or in tandem with their manager or team.
If video is eating the world, the AI video interview is eating how companies find the best talent. Reliability and objectivity are built into Retorio’s AI system. Companies like BMW and Lufthansa trust the process we’ve created. We’ve worked hard to design a system that ensures results are reliable and individuals are respected. Like good science, we’ll keep moving forward, continuously monitoring, developing, and testing our own product.