
[Cuebot] Add FIFO scheduling capability #1060

Merged


Contributor

@splhack splhack commented Nov 4, 2021

This PR addresses #1059 (Frame scheduling) by adding FIFO scheduling capability.
By FIFO scheduling I mean that the oldest frame runs first when priorities are equal, which is the behavior commonly expected of a job scheduling system.

Changes

  • Inject Environment into DispatcherDaoJdbc to set the fifoSchedulingEnabled variable from the dispatcher.fifo_scheduling_enabled property. The default value is false, so the default scheduling behavior does not change.
  • Add ORDER BY clauses to FIND_JOBS_BY_SHOW/GROUP so that, when fifoSchedulingEnabled is true, results are sorted with the highest-priority job first and, within the same priority, the oldest job first.
  • Use List instead of Set to preserve the job order of the SQL query result.
  • Add unit tests to verify that fifoSchedulingEnabled defaults to false and that scheduling is then not FIFO, and that FIFO scheduling works when it is set to true.
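
As a rough illustration of the ordering described in the list above (highest priority first, then oldest first), here is a minimal in-memory sketch. It is not the PR's SQL or Cuebot's data model: `JobSketch`, its fields, and `FifoOrderSketch` are hypothetical names made up for this example. It also shows why an ordered `List` matters: the sorted order is only useful if the collection that carries the result preserves it, which a hash-based `Set` would not.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a dispatchable job; not Cuebot's actual model.
final class JobSketch {
    final String name;
    final int priority;      // assumption: a higher value is dispatched earlier
    final long submittedAt;  // assumption: epoch seconds, smaller means older

    JobSketch(String name, int priority, long submittedAt) {
        this.name = name;
        this.priority = priority;
        this.submittedAt = submittedAt;
    }
}

final class FifoOrderSketch {
    // In-memory analogue of "order by priority descending, then age":
    // only the priority comparison is reversed, submit time stays ascending.
    static final Comparator<JobSketch> FIFO_ORDER =
            Comparator.comparingInt((JobSketch j) -> j.priority).reversed()
                    .thenComparingLong(j -> j.submittedAt);

    // Returns job names in dispatch order; a List keeps that order intact.
    static List<String> dispatchOrder(List<JobSketch> jobs) {
        List<JobSketch> sorted = new ArrayList<>(jobs);
        sorted.sort(FIFO_ORDER);
        List<String> names = new ArrayList<>();
        for (JobSketch job : sorted) {
            names.add(job.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<JobSketch> jobs = List.of(
                new JobSketch("newer-high", 100, 300L),
                new JobSketch("low", 50, 100L),
                new JobSketch("older-high", 100, 200L));
        // prints [older-high, newer-high, low]
        System.out.println(dispatchOrder(jobs));
    }
}
```

Note that "low" is the oldest job overall but still runs last: age only breaks ties between jobs of equal priority.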

Behavior

The FIFO scheduling behavior is not strictly guaranteed. Multiple Cuebot instances and multiple threads may schedule frames simultaneously, so there is no guarantee that the oldest frame always runs first. In practice, though, the ordering is very close to what is expected.

The runtime cost should be almost identical with or without FIFO scheduling, since findDispatchJobs does not handle many jobs at a time.

Collaborator

@DiegoTavares DiegoTavares left a comment


LGTM

@DiegoTavares
Collaborator

Internally at Imageworks we use a different sorting logic, based not only on the job priority but also on the number of cores required and the job "age" in days. We found that, for bigger job queues, sorting by priority alone can starve lower-priority jobs. See #788.

@splhack
Contributor Author

splhack commented Dec 1, 2021

In our use case, we plan to use Groups (Folders) for scheduling.
Each team/project will have groups, with multiple priority groups per facility.
There will also be two levels of quota:

  1. Hard quota
    • Use the group's max CPU cores and max GPU units to limit excessive utilization
  2. Soft quota
    • Move jobs to a lower-priority group if a team/project exceeds the soft-quota limit
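
The two quota levels above could be sketched roughly as follows. This is only an illustration of the idea, not anything implemented in this PR or in Cuebot: `GroupQuotaSketch`, its fields, and the `Decision` values are hypothetical, only CPU cores are modeled (GPU units would follow the same pattern), and treating the hard quota as "do not dispatch" is an assumption.

```java
// Hypothetical two-level quota check for a group; names and semantics are
// illustrative assumptions, not Cuebot's schema or API.
final class GroupQuotaSketch {
    enum Decision {
        DISPATCH, // within the soft quota: run in the current group
        DEMOTE,   // over the soft quota: move the job to a lower-priority group
        HOLD      // would exceed the hard quota: do not dispatch now
    }

    private final int softMaxCores;
    private final int hardMaxCores;

    GroupQuotaSketch(int softMaxCores, int hardMaxCores) {
        if (softMaxCores < 0 || hardMaxCores < softMaxCores) {
            throw new IllegalArgumentException(
                    "expected 0 <= softMaxCores <= hardMaxCores");
        }
        this.softMaxCores = softMaxCores;
        this.hardMaxCores = hardMaxCores;
    }

    // Decide what to do with a request for more cores, given current usage.
    Decision decide(int coresInUse, int coresRequested) {
        int coresAfter = coresInUse + coresRequested;
        if (coresAfter > hardMaxCores) {
            return Decision.HOLD;
        }
        if (coresAfter > softMaxCores) {
            return Decision.DEMOTE;
        }
        return Decision.DISPATCH;
    }
}
```

For example, with a soft quota of 60 cores and a hard quota of 100, a group using 50 cores that asks for 20 more would have its job demoted, while a group using 90 cores would have the same request held.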

We may also want to add preemption (killing frames, without incrementing the retry count, for a certain duration when demand is very high).

Collaborator

@bcipriano bcipriano left a comment


Nice!

@splhack
Contributor Author

splhack commented Dec 14, 2021

Merging this would make working on #1069 easier. Could you merge it? @bcipriano @DiegoTavares @roulaoregan-spi

@DiegoTavares DiegoTavares merged commit 686d55e into AcademySoftwareFoundation:master Dec 14, 2021