APScheduler Python example guide
Installing APScheduler

The preferred installation method is by using pip:

    $ pip install apscheduler

If you don’t have pip installed, you can easily install it by downloading and running get-pip.py.

If, for some reason, pip won’t work, you can manually download the APScheduler distribution from PyPI, extract it and then install it:

    $ python setup.py install

Code examples

The source distribution contains code examples.

Basic concepts

APScheduler has four kinds of components:

- triggers
- job stores
- executors
- schedulers
Triggers contain the scheduling logic. Each job has its own trigger which determines when the job should be run next. Beyond their initial configuration, triggers are completely stateless.

Job stores house the scheduled jobs. The default job store simply keeps the jobs in memory, but others store them in various kinds of databases. A job’s data is serialized when it is saved to a persistent job store, and deserialized when it’s loaded back from it. Job stores (other than the default one) don’t keep the job data in memory, but act as middlemen for saving, loading, updating and searching jobs in the backend. Job stores must never be shared between schedulers.

Executors are what handle the running of the jobs. They typically do this by submitting the designated callable in a job to a thread or process pool. When the job is done, the executor notifies the scheduler, which then emits an appropriate event.

Schedulers are what bind the rest together. You typically have only one scheduler running in your application. The application developer doesn’t normally deal with the job stores, executors or triggers directly. Instead, the scheduler provides the proper interface to handle all those. Configuring the job stores and executors is done through the scheduler, as is adding, modifying and removing jobs.

Choosing the right scheduler, job store(s), executor(s) and trigger(s)

Your choice of scheduler depends mostly on your programming environment and what you’ll be using APScheduler for. Here’s a quick guide for choosing a scheduler:

- BlockingScheduler: use when the scheduler is the only thing running in your process
- BackgroundScheduler: use when you’re not using any of the frameworks below, and want the scheduler to run in the background inside your application
- AsyncIOScheduler: use if your application uses the asyncio module
- GeventScheduler: use if your application uses gevent
- TornadoScheduler: use if you’re building a Tornado application
- TwistedScheduler: use if you’re building a Twisted application
- QtScheduler: use if you’re building a Qt application
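Simple enough, yes?

To pick the appropriate job store, you need to determine whether you need job persistence or not. If you always recreate your jobs at the start of your application, then you can probably go with the default (MemoryJobStore).

Likewise, the choice of executors is usually made for you if you use one of the frameworks above. Otherwise, the default ThreadPoolExecutor should be good enough for most purposes.

When you schedule a job, you need to choose a trigger for it. The trigger determines how the dates and times at which the job will run are calculated. APScheduler comes with three built-in trigger types:

- date: use when you want to run the job just once at a certain point of time
- interval: use when you want to run the job at fixed intervals of time
- cron: use when you want to run the job periodically at certain time(s) of day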
It is also possible to combine multiple triggers into one which fires either on times agreed on by all the participating triggers, or when any of the triggers would fire. For more information, see the documentation for combining triggers.

You can find the plugin names of each job store, executor and trigger type on their respective API documentation pages.

Configuring the scheduler

APScheduler provides many different ways to configure the scheduler. You can use a configuration dictionary or you can pass in the options as keyword arguments. You can also instantiate the scheduler first, add jobs and configure the scheduler afterwards. This way you get maximum flexibility for any environment.

The full list of scheduler level configuration options can be found on the API reference of the BaseScheduler class.

Let’s say you want to run BackgroundScheduler in your application with the default job store and the default executor:

    from apscheduler.schedulers.background import BackgroundScheduler

    scheduler = BackgroundScheduler()
    # Initialize the rest of the application here, or before the scheduler initialization

This will get you a BackgroundScheduler with a MemoryJobStore named “default” and a ThreadPoolExecutor named “default” with a default maximum thread count of 10.

Now, suppose you want more. You want to have two job stores using two executors and you also want to tweak the default values for new jobs and set a different timezone. The following three examples are completely equivalent, and will get you:

- a MongoDBJobStore named “mongo”
- an SQLAlchemyJobStore named “default” (using SQLite)
- a ThreadPoolExecutor named “default”, with a worker count of 20
- a ProcessPoolExecutor named “processpool”, with a worker count of 5
- UTC as the scheduler’s timezone
- coalescing turned off for new jobs by default
- a default maximum instance limit of 3 for new jobs
Method 1:

    from pytz import utc

    from apscheduler.schedulers.background import BackgroundScheduler
    from apscheduler.jobstores.mongodb import MongoDBJobStore
    from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
    from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor

    jobstores = {
        'mongo': MongoDBJobStore(),
        'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
    }
    executors = {
        'default': ThreadPoolExecutor(20),
        'processpool': ProcessPoolExecutor(5)
    }
    job_defaults = {
        'coalesce': False,
        'max_instances': 3
    }
    scheduler = BackgroundScheduler(jobstores=jobstores, executors=executors, job_defaults=job_defaults, timezone=utc)

Method 2:

    from apscheduler.schedulers.background import BackgroundScheduler

    # The "apscheduler." prefix is hard coded
    scheduler = BackgroundScheduler({
        'apscheduler.jobstores.mongo': {
            'type': 'mongodb'
        },
        'apscheduler.jobstores.default': {
            'type': 'sqlalchemy',
            'url': 'sqlite:///jobs.sqlite'
        },
        'apscheduler.executors.default': {
            'class': 'apscheduler.executors.pool:ThreadPoolExecutor',
            'max_workers': '20'
        },
        'apscheduler.executors.processpool': {
            'type': 'processpool',
            'max_workers': '5'
        },
        'apscheduler.job_defaults.coalesce': 'false',
        'apscheduler.job_defaults.max_instances': '3',
        'apscheduler.timezone': 'UTC',
    })

Method 3:

    from pytz import utc

    from apscheduler.schedulers.background import BackgroundScheduler
    from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
    from apscheduler.executors.pool import ProcessPoolExecutor

    jobstores = {
        'mongo': {'type': 'mongodb'},
        'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
    }
    executors = {
        'default': {'type': 'threadpool', 'max_workers': 20},
        'processpool': ProcessPoolExecutor(max_workers=5)
    }
    job_defaults = {
        'coalesce': False,
        'max_instances': 3
    }
    scheduler = BackgroundScheduler()

    # .. do something else here, maybe add jobs etc.

    scheduler.configure(jobstores=jobstores, executors=executors, job_defaults=job_defaults, timezone=utc)

Starting the scheduler

Starting the scheduler is done by simply calling start() on the scheduler. For BlockingScheduler, you will only want to call start() after you’re done with any initialization steps.
Note: After the scheduler has been started, you can no longer alter its settings.

Adding jobs

There are two ways to add jobs to a scheduler:

- by calling add_job()
- by decorating a function with scheduled_job()
The first way is the most common way to do it. The second way is mostly a convenience to declare jobs that don’t change during the application’s run time. The add_job() method returns a Job instance that you can use to modify or remove the job later.

You can schedule jobs on the scheduler at any time. If the scheduler is not yet running when the job is added, the job will be scheduled tentatively and its first run time will only be computed when the scheduler starts.

It is important to note that if you use an executor or job store that serializes the job, it will add a couple of requirements on your job:

- The target callable must be globally accessible
- Any arguments to the callable must be serializable
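The serialization requirements can be checked ahead of time. The sketch below assumes the job store serializes job data with pickle. The helper is_picklable and the example function and arguments are hypothetical and only illustrate the idea.

```python
import pickle
import threading


def send_report(recipient, options):
    """Defined at module level, so it can be referenced by its importable name."""
    return f"report for {recipient} ({options['format']})"


def is_picklable(value):
    """Hypothetical helper: return True if `value` can be pickled."""
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


# Plain strings and dicts serialize without trouble.
good_args = ("ops@example.com", {"format": "pdf"})

# A lock holds operating system state and cannot be pickled.
bad_args = ("ops@example.com", threading.Lock())

good_ok = all(is_picklable(arg) for arg in good_args)
bad_ok = all(is_picklable(arg) for arg in bad_args)
```

A check like this, run before a job is added, points at the offending argument right away instead of leaving the failure to surface when the job is stored.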
Of the builtin job stores, only MemoryJobStore doesn’t serialize jobs. Of the builtin executors, only ProcessPoolExecutor will serialize jobs.

Important: If you schedule jobs in a persistent job store during your application’s initialization, you MUST define an explicit ID for the job and use replace_existing=True, or you will get a new copy of the job every time your application restarts.

Tip: To run a job immediately, omit the trigger argument when adding the job.

Removing jobs

When you remove a job from the scheduler, it is removed from its associated job store and will not be executed anymore. There are two ways to make this happen:

- by calling remove_job() with the job’s ID
- by calling remove() on the Job instance you got from add_job()

The latter method is probably more convenient, but it requires that you store the Job instance you received when adding the job somewhere.

If the job’s schedule ends (i.e. its trigger doesn’t produce any further run times), it is automatically removed.

Example:

    job = scheduler.add_job(myfunc, 'interval', minutes=2)
    job.remove()

Same, using an explicit job ID:

    scheduler.add_job(myfunc, 'interval', minutes=2, id='my_job_id')
    scheduler.remove_job('my_job_id')

Getting a list of scheduled jobs

To get a machine processable list of the scheduled jobs, you can use the get_jobs() method.
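As a convenience, you can use the print_jobs() method, which prints a formatted list of jobs, their triggers and next run times.

Modifying jobs

You can modify any job attribute except id by calling either modify() on the Job instance or modify_job() on the scheduler.

Example:

    job.modify(max_instances=6, name='Alternate name')

If you want to reschedule the job, that is, change its trigger, you can use either reschedule() on the Job instance or reschedule_job() on the scheduler.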
Example:

    scheduler.reschedule_job('my_job_id', trigger='cron', minute='*/5')

Shutting down the scheduler

To shut down the scheduler:

    scheduler.shutdown()

By default, the scheduler shuts down its job stores and executors and waits until all currently executing jobs are finished. If you don’t want to wait, you can do:

    scheduler.shutdown(wait=False)

This will still shut down the job stores and executors but does not wait for any running tasks to complete.

Pausing/resuming job processing

It is possible to pause the processing of scheduled jobs:

    scheduler.pause()

This will cause the scheduler to not wake up until processing is resumed:

    scheduler.resume()

It is also possible to start the scheduler in paused state, that is, without the first wakeup call:

    scheduler.start(paused=True)

This is useful when you need to prune unwanted jobs before they have a chance to run.

Limiting the number of concurrently executing instances of a job

By default, only one instance of each job is allowed to be run at the same time. This means that if the job is about to be run but the previous run hasn’t finished yet, then the latest run is considered a misfire. It is possible to set the maximum number of instances for a particular job that the scheduler will let run concurrently, by using the max_instances keyword argument when adding the job.

Missed job executions and coalescing

Sometimes the scheduler may be unable to execute a scheduled job at the time it was scheduled to run. The most common case is when a job is scheduled in a persistent job store and the scheduler is shut down and restarted after the job was supposed to execute. When this happens, the job is considered to have “misfired”. The scheduler will then check each missed execution time against the job’s misfire_grace_time option to see if the execution should still be triggered. This can lead to the job being executed several times in succession.

If this behavior is undesirable for your particular use case, it is possible to use coalescing to roll all these missed executions into one. In other words, if coalescing is enabled for the job and the scheduler sees one or more queued executions for the job, it will only trigger it once. No misfire events will be sent for the “bypassed” runs.

Note: If the execution of a job is delayed due to no threads or processes being available in the pool, the executor may skip it due to it being run too late (compared to its originally designated run time). If this is likely to happen in your application, you may want to either increase the number of threads/processes in the executor, or adjust the misfire_grace_time setting to a higher value.

Scheduler events

It is possible to attach event listeners to the scheduler.
Scheduler events are fired on certain occasions, and may carry additional information in them concerning the details of that particular event. It is possible to listen to only particular types of events by giving the appropriate mask argument to add_listener(), OR’ing the different constants together.

See the documentation for the events module (apscheduler.events) for specifics on the available events and their attributes.

Example:

    from apscheduler.events import EVENT_JOB_ERROR, EVENT_JOB_EXECUTED

    def my_listener(event):
        # event.exception is set only when the job raised an error
        if event.exception:
            print('The job crashed :(')
        else:
            print('The job worked :)')

    scheduler.add_listener(my_listener, EVENT_JOB_EXECUTED | EVENT_JOB_ERROR)

Troubleshooting

If the scheduler isn’t working as expected, it will be helpful to increase the logging level of the apscheduler logger to the DEBUG level.

If you do not yet have logging enabled in the first place, you can do this:

    import logging

    logging.basicConfig()
    logging.getLogger('apscheduler').setLevel(logging.DEBUG)

This should provide lots of useful information about what’s going on inside the scheduler.

Also make sure that you check the Frequently Asked Questions section to see if your problem already has a solution.